PROTEINS INVOLVED IN THE SYNTHESIS AND ASSEMBLY OF O-ANTIGEN IN PSEUDOMONAS AERUGINOSA

pan,cul, riy a pr<>blem jn dmduais. p. ^,, glmm ^ 

"•Mutt ,„ many MibiMcs >' ■ "*»>• H» organic, te inw,,^, 

-op„, yMcchandes |n ^" *™ " "°'° Ki " K "'° te "«- — 

O a„«i gen KpeMng ^ b —» «-*— of lipid A-cc™ oligopeptide. 

-a—I by the majoiity of „ aeruginosa seto^^d h " ' C ° mm0n ' 0tm 

<~g «^ and u . „e t . ropo)ymer compo ^ V ; 

conning , wide variwy of P * *• •» penta., a ech„ide ^ 

5 b '"" « attached „ lipid a co """" ^ ^ "» A - - 

(Riven, et .1.. lnz) , a ™ M '° «« ""-position o( ^ ^ ^ 

' ™~»* * g«« such as "T" 0 - chiding s „ me fr „ m 

Muthari.. 19 o 5; Comstock „ a , ^ «— , e, a ,.. 19 o 2; Amor „ d 

^ "' d •"""""■»"'»«■•' (Kingsley et ai. W3) 

- = r 



PCT/CA97/00295 

WO 97/41234 


differ significantly in their rfb clusters, while unique O-antigens can be encoded by only
slightly variant rfb genes in other strains (Whitfield and Valvano, 1993).

Lightfoot and Lam were the first to report the cloning of genes
involved in the expression of A-band (Lightfoot and Lam, 1991) and B-band (Lightfoot and
Lam, 1993) LPS of P. aeruginosa. A cosmid clone, pFV3, complemented A-band
LPS synthesis in an A-band-deficient mutant, rd7513. pFV3 also mediated A-band LPS
synthesis in five of the P. aeruginosa O serotypes which lack A-band LPS. Another
cosmid clone, pFV100, complemented B-band LPS synthesis in mutant ga6, which lacks
B-band LPS. Physical mapping of the genes involved in A-band and B-band LPS synthesis
indicated that the two gene clusters are physically distinct, and are separated by more than
1.9 Mbp on the P. aeruginosa PAO1 genome. The A-band LPS genes mapped to an interval
ending at 13.3 min, and the B-band LPS genes mapped at 1.9 Mbp (near 37 min) on the 5.9
Mbp chromosome.

O5 has a trisaccharide repeating unit of 2-acetamido-3-acetamidino-
2,3-dideoxy-D-mannuronic acid, 2,3-diacetamido-D-mannuronic acid, and N-acetyl-D-
fucosamine (Knirel et al., 1988) (Figure 30). Serotypes O2, O16, O18, and O20 of P. aeruginosa
have O-antigens similar to serotype O5, varying only in one linkage or one epimer from O5
(Knirel et al., 1988) (Figure 30). Immunochemical cross reactions have also been demonstrated
among LPS of these serotypes. A gene (referred to as "rfbA" and "wbpL") from the O5 gene
cluster has been characterized (Dasgupta and Lam, 1995). This O5 O-antigen biosynthetic
gene has been shown to hybridize only with chromosomal DNA from the group of five
serotypes with similar O-antigens, and not with the remaining fifteen serotypes.

Two pathways are known for the synthesis and assembly of LPS, the
Rfc-dependent and Rfc-independent pathways. Rfc is the O-antigen polymerase, and
appears to be required for the assembly of heteropolymeric O-antigens (Makela and
Stocker, 1984). In contrast, homopolymeric O-antigens appear to be assembled without an
O-antigen polymerase (Whitfield, 1995). Rfc-dependent (or Wzy) LPS assembly has been
shown to involve at least two other gene products which act in concert with Rfc: one
involved in the transfer of O-antigen units across the cytoplasmic membrane where they are
polymerized by Rfc (or Wzy), and Rol (or Wzz), the regulator of O-antigen chain length,
which determines the preferred O-antigen chain length characteristic of the individual
strain or serotype (Batchelor et al., 1993; Bastin et al., 1993; Morona et al., 1994b;
Dodgson et al., 1996).
SUMMARY OF THE INVENTION
known as wbpL ,„ d rFA nfcsgupta and Um 199S1 ' " ^ ** ' SW - ge ™ ab ° 
-Ml -Ml. ** and wbpK respechjy Th-enlt H 

-n.ua, the novel genes ,s>Ma„d psbN ^ * £~ ^iTT f ' 002 ° 
«*,M and „spectively (- Graip „ _ *" ™ """"" '° herei " » 

which a,e „„, ta Jl' ' ^ 8CTe duster **» stains genes 

•nvoiveo in LPS synthesis including the eenes posA .„J t,_r, j . 
genes designated u„B. u^ elelMn , ^ I£T^. " d ~ ' 

» 6— in^^geneCusterissnowninHg^,. — I—, of the 

The identification and sequencing of the genes in the gene cluster permits the
identification of substances which affect O-antigen synthesis or assembly in P. aeruginosa.

^ oivea in the synthesis, and assembly of lipopolysaccharide in p 

the genes *,s H J d Ws , ^ T 
The present invention also relates to nucleic acid molecules encoding
the following proteins: (1) (a) Rol (also known as Wzz); (b) PsbA (also known as WbpA); (c)
PsbB (also known as WbpB); (d) PsbC (also known as WbpC); (e) PsbD (also known as
WbpD); (f) PsbE (also known as WbpE); (g) Rfc (also known as Wzy); (h) PsbF (also known
as WbpF); (i) PsbG (also known as WbpG); (j) PsbI (also known as WbpI); (k) PsbJ (also
known as WbpJ); (l) PsbK (also known as WbpK); (m) PsbM (also known as WbpM); (n) PsbH
(also known as WbpH) or (o) PsbN (also known as WbpN), involved in P. aeruginosa
O-antigen synthesis and assembly; (2) UvrB involved in ultraviolet repair; (3) HisH or HisF
involved in histidine synthesis, or (4) RpsA, a 30S ribosomal subunit protein S1. In addition,
nucleic acid molecules are provided which contain sequences encoding two or more of the
following proteins: (1) (a) Rol (also known as Wzz); (b) PsbA (also known as WbpA); (c) PsbB
(also known as WbpB); (d) PsbC (also known as WbpC); (e) PsbD (also known as WbpD); (f)
PsbE (also known as WbpE); (g) Rfc (also known as Wzy); (h) PsbF (also known as WbpF);
(i) HisH; (j) HisF; (k) PsbG (also known as WbpG); (l) PsbI (also known as WbpI); (m) PsbJ
(also known as WbpJ); (n) PsbK (also known as WbpK); (o) PsbM (also known as WbpM); (p)
PsbN (also known as WbpN); (q) PsbH (also known as WbpH); (r) PsbL (also known as
WbpL); and (s) RpsA.

The invention also contemplates a nucleic acid molecule comprising a 
sequence encoding a truncation of a protein of the invention, an analog, or a homolog of a 
protein of the invention, or a truncation thereof. 

The nucleic acid molecules of the invention may be inserted into an
appropriate expression vector, i.e. a vector which contains the necessary elements for the
transcription and translation of the inserted coding sequence. Accordingly, recombinant
expression vectors adapted for transformation of a host cell may be constructed which
comprise a nucleic acid molecule of the invention and one or more transcription and
translation elements operatively linked to the nucleic acid molecule.

The recombinant expression vector may be used to prepare transformed
host cells expressing a protein of the invention. Therefore, the invention further provides
host cells containing a recombinant molecule of the invention.

The invention further provides a method for preparing a protein of the
invention utilizing the purified and isolated nucleic acid molecules of the invention. In an
embodiment a method for preparing a protein of the invention is provided comprising (a)
transferring a recombinant expression vector of the invention into a host cell; (b) selecting
transformed host cells from untransformed host cells; (c) culturing a selected transformed
host cell under conditions which allow expression of the protein; and (d) isolating the
protein.

The invention further broadly contemplates an isolated protein
characterized in that it has part or all of the primary structural conformation (i.e.
continuous sequence of amino acid residues) of a novel protein encoded by a gene of the wbp
gene cluster of the invention. In an embodiment of the invention, a purified protein is
provided which has the amino acid sequence as shown in Figure 3 or SEQ ID NO:2; Figure 4
or SEQ ID NO:3; Figure 5 or SEQ ID NO:4; Figure 6 or SEQ ID NO:5; Figure 7 or SEQ ID
NO:6; Figure 8 or SEQ ID NO:7; Figure 9 or SEQ ID NO:8; Figure 10 or SEQ ID NO:9; Figure
11 or SEQ ID NO:10; Figure 12 or SEQ ID NO:11; Figure 13 or SEQ ID NO:12; Figure 14 or
SEQ ID NO:13; Figure 15 or SEQ ID NO:14; Figure 16 or SEQ ID NO:15; Figure 17 or SEQ ID
NO:16; Figure 18 or SEQ ID NO:17; Figure 19 or SEQ ID NO:18; or Figure 20 or SEQ ID
NO:19. The invention also includes truncations of the protein and analogs, homologs, and
isoforms of the protein and truncations thereof.

The proteins of the invention may be conjugated with other molecules,
such as proteins, to prepare fusion proteins. This may be accomplished, for example, by the
synthesis of N-terminal or C-terminal fusion proteins.

The nucleic acid molecules of the invention allow those skilled in the
art to construct nucleotide probes for use in the detection of nucleotide sequences in samples
such as biological (e.g. clinical specimens), food, or environmental samples. The nucleotide
probes may also be used to detect nucleotide sequences that encode proteins related to or
analogous to the proteins of the invention.

Accordingly, the invention provides a method for detecting the
presence of a nucleic acid molecule having a sequence encoding a protein of the invention
in a sample, comprising contacting the sample with a nucleotide probe which hybridizes with
the nucleic acid molecule, to form a hybridization product under conditions which permit the
formation of the hybridization product, and assaying for the hybridization product.
nucleic acid moi^T Pr ° V " te * *' *" «« P"»»<* »' a 

chain reaction (PCR). «"-p'e m ine polymerase 

Accordingly, the invention relates to a method of determining the
presence of a nucleic acid molecule having a sequence encoding a protein of the invention in a
sample, comprising treating the sample with primers which are capable of amplifying the
nucleic acid molecule in an amplification reaction, preferably a polymerase chain
reaction, to form amplified sequences, under conditions which permit the formation of
amplified sequences, and assaying for amplified sequences.

The invention further relates to a kit for determining the presence of a
nucleic acid molecule having a sequence encoding a protein of the invention in a sample,
comprising primers which are capable of amplifying the nucleic acid molecule in an
amplification reaction, preferably a polymerase chain reaction, to form amplified
sequences, reagents required for amplifying the nucleic acid molecule in an
amplification reaction, means for assaying the amplified sequences, and directions for its
use.

The invention also relates to an antibody specific for an epitope of a
protein of the invention, and methods for preparing the antibodies. Antibodies specific for a
protein encoded by a Group I gene can be used to detect P. aeruginosa serotypes O2, O5, O16,
O18, and O20 in a sample, and antibodies specific for a protein encoded by a Group II gene
can be used to detect P. aeruginosa serotypes O1 to O20 in a sample. Therefore, the invention
also relates to a method for detecting P. aeruginosa serotypes O2, O5, O16, O18, and O20 in a
sample comprising contacting a sample with an antibody specific for an epitope of a protein
encoded by a Group I gene which antibody is capable of being detected after it becomes
bound to a protein in the sample, and assaying for antibody bound to protein in the sample,
or unreacted antibody. A method is also provided for detecting P. aeruginosa serotypes O1 to
O20 in a sample comprising contacting a sample with an antibody specific for an epitope of
a protein encoded by a Group II gene which antibody is capable of being detected after it
becomes bound to a protein in the sample, and assaying for antibody bound to protein in the
sample, or unreacted antibody.

A kit for detecting P. aeruginosa serotypes in a sample comprising an
antibody of the invention, preferably a monoclonal antibody, and directions for its use is also
provided. The kit may also contain reagents which are required for binding of the antibody
to the protein in the sample.

As discussed above, the identification and sequencing of genes in the
wbp gene cluster in P. aeruginosa permits the identification of substances which affect the
activity of the proteins encoded by the genes in the cluster, or the expression of the proteins,
thereby affecting O-antigen synthesis or assembly. These substances may be useful in
rendering the microorganisms more susceptible to attack by host defence mechanisms.
Accordingly, the invention provides a method for assaying for a substance that affects one
or both of P. aeruginosa O-antigen synthesis or assembly comprising mixing a protein or
nucleic acid molecule of the invention with a test substance which is suspected of affecting
P. aeruginosa O-antigen synthesis or assembly, and determining the effect of the substance
by comparing to a control.

Other objects, features and advantages of the present invention will
become apparent from the following detailed description. It should be understood,
however, that the detailed description and the specific examples while indicating
preferred embodiments of the invention are given by way of illustration only, since various
changes and modifications within the spirit and scope of the invention will become
apparent to those skilled in the art from this detailed description.
BRIEF DESCRIPTION OF DRAWINGS




The invention will now be described in relation to the drawings:

Figure 1 shows the organization of the P. aeruginosa PAO1 wbp
gene cluster;

Figure 2 shows the nucleic acid sequence of the P. aeruginosa PAO1 wbp gene
cluster (SEQ. ID. NO. 1);

Figure 3 shows the amino acid sequence of the Rol (Wzz) protein of the
invention (SEQ. ID NO. 2);

Figure 4 shows the amino acid sequence of the PsbA (WbpA) protein of
the invention (SEQ. ID NO. 3);

Figure 5 shows the amino acid sequence of the PsbB (WbpB) protein of
the invention (SEQ. ID NO. 4);

Figure 6 shows the amino acid sequence of the PsbC (WbpC) protein of
the invention (SEQ. ID NO. 5);

Figure 7 shows the amino acid sequence of the PsbD (WbpD) protein of
the invention (SEQ. ID NO. 6);

Figure 8 shows the amino acid sequence of the PsbE (WbpE) protein of
the invention (SEQ. ID NO. 7);

Figure 9 shows the amino acid sequence of the Rfc (Wzy) protein of the
invention (SEQ. ID NO. 8);

Figure 10 shows the amino acid sequence of the PsbF (WbpF) protein of
the invention (SEQ. ID NO. 9);

Figure 11 shows the amino acid sequence of the HisH protein of the
invention (SEQ. ID NO. 10);

Figure 12 shows the amino acid sequence of the HisF protein of the
invention (SEQ. ID NO. 11);

Figure 13 shows the amino acid sequence of the PsbG (WbpG) protein of
the invention (SEQ. ID NO. 12);

Figure 14 shows the amino acid sequence of the PsbH (WbpH) protein
of the invention (SEQ. ID NO. 13);

Figure 15 shows the amino acid sequence of the PsbI (WbpI) protein of
the invention (SEQ. ID NO. 14);

Figure 16 shows the amino acid sequence of the PsbJ (WbpJ) protein of
the invention (SEQ. ID NO. 15);

Figure 17 shows the amino acid sequence of the PsbK (WbpK) protein of
the invention (SEQ. ID NO. 16);

Figure 18 shows the amino acid sequence of the PsbM (WbpM) protein
of the invention (SEQ. ID NO. 17);
Figure 19 shows the amino acid sequence of the PsbN (WbpN) protein
of the invention (SEQ. ID NO. 18);

Figure 20 shows the amino acid sequence of the UvrB protein of the
invention (SEQ. ID NO. 19);

Figure 21 shows the amino acid sequence of PsbL (WbpL) (SEQ. ID NO. 20);

Figure 22 shows a silver-stained SDS-PAGE gel of LPS from PAO1,
AK1401, AK1401(pFV100), and AK1401(pFV.TK8) (Panel A) and Western immunoblots of
this LPS reacted with O5-specific MAb MF15-4 (Panel B);

Figure 23 shows restriction maps of the chromosomal inserts from
pFV100 and several pFV subclones, and the results of complementation studies of the SR
mutants AK1401 and rd7513 with the pFV subclones are also shown;

Figure 24 shows a Southern analysis of the three rfc (wzy)
chromosomal mutants, OP5.2, OP5.3, and OP5.5, showing the insertion of an 875 bp GmR
cassette into the rfc (wzy) gene (panel C), and restriction maps of the PAO1 wild-type
(panel A) and mutant (panel B) rfc (wzy) coding regions are shown;

Figure 25 shows a silver-stained SDS-PAGE gel (panel A) and Western
blots of LPS from PAO1, AK1401 and the three rfc (wzy) chromosomal mutants, OP5.2,
OP5.3, and OP5.5 (Panels B and C);

Figure 26 shows the restriction maps of recombinant plasmids pFV161,
pFV401, and pFV402;

Figure 27 are blots of Southern hybridizations of chromosomal DNA
from PAO1 (lane 2) and rol (wzz) mutants (lanes 3 and 4);

Figure 28 are Western immunoblots showing the characterization of
LPS from PAO1 and PAO1 rol (wzz) chromosomal mutants;

Figure 29 is an autoradiogram showing 35S-labeled proteins expressed
by pFV401, which contains the rol (wzz) gene, and corresponding control plasmid vector
pBluescript II SK in E. coli JM109DE3 by use of the T7 expression system;

Figure 30 is a diagram showing the structures of the O-antigens of P.
aeruginosa serotypes related to O5;

Figure 31 shows E. coli σ70 and similar regions in psbA (wbpA), hisH,
psbG (wbpG), IS407 and psbN (wbpN);

Figure 32 shows features of the psb genes of the psb gene cluster
identifying the presumed start codon and spaces between RBS (ribosome binding sequence)
and the first codon;
biosynthesis;

BS-OrfX. an d SB-R^C " S ' " ^ BP-Bp-D. EC-NfrC, 

TrsE; ^ " ^ ' '"•»—« *» >"A-Psb,. BP-BplE. » d V E . 

^ Figure 40 shows . for pA psbL ^ ^ hj 

C.pD; H8U " 41 5 " <>,, ' ! 3 " W *» »*M. TrsG, BP-Bp, L , » d 

Kgure 42 shows ft. nuclK>li(te o( ^ ^ 

F.g-43is ap h y sica, TOp o, [he5 . endofthewopchX ' 
proteins; ,8Ure ^ ' S ' C ° mpari5<,n * h >""°P^ 1** or s.1^ w «. lik . 

Kgur. 45 shows the expression of p. „ 

figure 46B shows a western immune*!,, using M .b 18-,o 

f .gu«46C shows . western taununobb. using Mab MFlsl. 

Figure 47 shows ,he abm* o, P. 05 w „ „ ^ ^ £ 

figure 4, shows fte amino add Md „ udeoUde ^ ^ ^ 

Figure 50 shows the amino acid and nucleotide sequence of HimD.

DETAILED DESCRIPTION OF THE INVENTION

The following standard abbreviations for the amino acid residues are
used throughout the specification: A, Ala - alanine; C, Cys - cysteine; D, Asp - aspartic
acid; E, Glu - glutamic acid; F, Phe - phenylalanine; G, Gly - glycine; H, His - histidine;
I, Ile - isoleucine; K, Lys - lysine; L, Leu - leucine; M, Met - methionine; N, Asn -
asparagine; P, Pro - proline; Q, Gln - glutamine; R, Arg - arginine; S, Ser - serine; T, Thr -
threonine; V, Val - valine; W, Trp - tryptophan; Y, Tyr - tyrosine; and p.Y., PTyr -
phosphotyrosine.
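
As an illustration only, the one-letter and three-letter abbreviations listed above can be held in a simple lookup table. The sketch below is not part of the specification; the names `ONE_TO_THREE` and `to_three_letter` are hypothetical, and only the twenty standard residues are covered (phosphotyrosine is left out).

```python
# Illustrative lookup of the standard amino acid abbreviations.
# ONE_TO_THREE and to_three_letter are hypothetical helper names.
ONE_TO_THREE = {
    "A": "Ala", "C": "Cys", "D": "Asp", "E": "Glu", "F": "Phe",
    "G": "Gly", "H": "His", "I": "Ile", "K": "Lys", "L": "Leu",
    "M": "Met", "N": "Asn", "P": "Pro", "Q": "Gln", "R": "Arg",
    "S": "Ser", "T": "Thr", "V": "Val", "W": "Trp", "Y": "Tyr",
}


def to_three_letter(peptide: str) -> str:
    """Convert a one-letter peptide string to hyphen-separated three-letter codes."""
    # A KeyError is raised for any character outside the twenty standard residues.
    return "-".join(ONE_TO_THREE[residue] for residue in peptide.upper())
```

For instance, `to_three_letter("MKV")` yields `Met-Lys-Val`.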
I. Nucleic Acid Molecules of the Invention

As hereinbefore mentioned, the present invention relates to an isolated
P. aeruginosa B-band gene cluster containing genes involved in the synthesis and assembly of
O-antigen in P. aeruginosa. The present invention also relates to the isolated genes which
comprise the cluster.

The term "isolated" refers to a nucleic acid substantially free of
cellular material or culture medium when produced by recombinant DNA techniques, or
chemical precursors, or other chemicals when chemically synthesized. The term "nucleic
acid" is intended to include DNA and RNA and can be either double stranded or single
stranded.

The P. aeruginosa B-band gene cluster comprises the following genes:
rol (wzz), psbA (wbpA), psbB (wbpB), psbC (wbpC), psbD (wbpD), psbE (wbpE), rfc (wzy),
psbF (wbpF), psbG (wbpG), psbH (wbpH), psbI (wbpI), psbJ (wbpJ), psbK (wbpK), psbL
(wbpL), psbM (wbpM), and psbN (wbpN), involved in the synthesis and assembly of
lipopolysaccharide in P. aeruginosa. The gene cluster may also contain the non-LPS genes
hisH, hisF, himD, rpsA, uvrB, and the insertion element IS407 (IS1209).

The genes preferably have the organization as shown in Figure 1 (SEQ.
ID. NO. 1). In Figure 1, the genes necessary for sugar biosynthesis (Man(2NAc3N)A and
Man(2NAc3NAc) biosynthesis) are scattered throughout the gene cluster (wbpI (psbI),
wbpE (psbE), wbpD (psbD), wbpB (psbB), wbpC (psbC)). The genes encoding transferases are
interspersed throughout the wbp (psb) cluster (wbpH (psbH), wbpJ (psbJ), wbpL (psbL)),
and are separated from one another by one gene each. The gene encoding the putative first
transferase (WbpL (PsbL)), thought to initiate O-antigen assembly by attachment of an
FucNAc residue to undecaprenol, is the most distal.

The invention provides nucleic acid molecules encoding the following
proteins: (1) (a) Rol (Wzz); (b) PsbA (WbpA); (c) PsbB (WbpB); (d) PsbC (WbpC); (e) PsbD
(WbpD); (f) PsbE (WbpE); (g) Rfc (Wzy); (h) PsbF (WbpF); (i) PsbG (WbpG); (j) PsbI
(WbpI); (k) PsbJ (WbpJ); (l) PsbK (WbpK); (m) PsbM (WbpM); (n) PsbH (WbpH); and (o)
PsbN (WbpN) involved in P. aeruginosa O-antigen synthesis and assembly; (2) UvrB
involved in ultraviolet repair; (3) HisH or HisF involved in histidine synthesis; (4)
HimD involved in host factor integration; and (5) RpsA, a 30S ribosomal subunit protein S1. In
addition, nucleic acid molecules are provided which contain sequences encoding two or more
of the following proteins: (1) (a) Rol (Wzz); (b) PsbA (WbpA); (c) PsbB (WbpB); (d) PsbC
(WbpC); (e) PsbD (WbpD); (f) PsbE (WbpE); (g) Rfc (Wzy); (h) PsbF (WbpF); (i) HisH; (j)
HisF; (k) PsbG (WbpG); (l) PsbI (WbpI); (m) PsbJ (WbpJ); (n) PsbK (WbpK); (o) PsbM
(WbpM); (p) PsbN (WbpN); (q) PsbH (WbpH); (r) PsbL (WbpL); (s) RpsA or (t) HimD.

In an embodiment of the invention, an isolated nucleic acid molecule is
provided having a sequence which encodes a protein having an amino acid sequence as
shown in Figure 3 or SEQ.ID. No.: 2; Figure 4 or SEQ.ID. No.: 3; Figure 5 or SEQ.ID. No.: 4;
Figure 6 or SEQ.ID. No.: 5; Figure 7 or SEQ.ID. No.: 6; Figure 8 or SEQ.ID. No.: 7; Figure 9 or
SEQ.ID. No.: 8; Figure 10 or SEQ.ID. No.: 9; Figure 11 or SEQ.ID. No.: 10; Figure 12 or
SEQ.ID. No.: 11; Figure 13 or SEQ.ID. No.: 12; Figure 14 or SEQ.ID. No.: 13; Figure 15 or
SEQ.ID. No.: 14; Figure 16 or SEQ.ID. No.: 15; Figure 17 or SEQ.ID. No.: 16; Figure 18 or
SEQ.ID. No.: 17; Figure 19 or SEQ.ID. No.: 18; and Figure 20 or SEQ.ID. No.: 19.

Preferably, the purified and isolated nucleic acid molecule comprises

(a) a nucleic acid sequence containing nucleotides 1-479; 1286-2596;
2670-3620; 3689-5578; 5575-6066; 6152-6982; 7236-8552; 8549-9499; 9831-10388; 10388-11143;
11281-12411; 12427-13548; 13545-14633; 14651-15892; 15889-16851; 17935-19144; 19678-21675;
22302-23693; or 23704-24417, as shown in Figure 2 or SEQ. ID. NO.: 1, wherein T can also be
U;

(b) a nucleic acid sequence containing two or more of nucleotides 1-479;
1286-2596; 2670-3620; 3689-5578; 5575-6066; 6152-6982; 7236-8552; 8549-9499; 9831-10388;
10388-11143; 11281-12411; 12427-13548; 13545-14633; 14651-15892; 15889-16851; 17935-19144;
19678-21675; 22302-23693; or 23704-24417, as shown in Figure 2 or SEQ. ID. NO.: 1, wherein T
can also be U;

(c) nucleic acid sequences complementary to (a) or (b);

(d) nucleic acid sequences which are homologous to (a) or (b);

(e) a fragment of (a) to (d) that is at least 15 bases, preferably 20 to 30
bases, and which will hybridize to (a) to (d) under stringent hybridization conditions; or

(f) a nucleic acid molecule differing from any of the nucleic acids of (a)
to (c) in codon sequences due to the degeneracy of the genetic code.
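
Items (a) to (c) above describe three simple sequence operations: selecting a nucleotide range by its 1-based, inclusive coordinates, reading T as U, and taking the complementary strand. A minimal sketch follows, assuming the sequence is held as a plain string of A, C, G and T; the function names are hypothetical and the short strings in the usage note are toy data, not the sequence of Figure 2.

```python
# Hypothetical helpers illustrating items (a) to (c); not part of the specification.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def subsequence(seq: str, start: int, end: int) -> str:
    """Return nucleotides start..end using 1-based, inclusive coordinates."""
    if not (1 <= start <= end <= len(seq)):
        raise ValueError("coordinates fall outside the sequence")
    return seq[start - 1:end]


def to_rna(dna: str) -> str:
    """Replace T with U ("wherein T can also be U")."""
    return dna.replace("T", "U").replace("t", "u")


def reverse_complement(dna: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]
```

With a toy sequence, `subsequence("ACGTACGT", 2, 5)` gives `CGTA`, `to_rna("ATGT")` gives `AUGU`, and `reverse_complement("ATGC")` gives `GCAT`.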

Specific embodiments of the nucleic acid molecule of the invention
include the following:

1. An isolated nucleic acid molecule characterized by having a
sequence encoding a Rol (Wzz) protein of P. aeruginosa which regulates O-antigen linking.
The nucleic acid molecule preferably encodes Rol having the amino acid sequence as shown
in Figure 3 or SEQ.ID. No.: 2, and most preferably comprises nucleotides 1-479 as shown in
Figure 2 or SEQ.ID. No.: 1, or a nucleotide sequence as shown in Figure 42, which shows the
full length nucleotide sequence of the rol gene.

2. An isolated nucleic acid molecule characterized by having a
sequence encoding a PsbA (WbpA) protein of P. aeruginosa which has dehydrogenase
activity. The nucleic acid molecule preferably encodes PsbA having the amino acid sequence
as shown in Figure 4 or SEQ.ID. No.: 3, and most preferably comprises nucleotides 1286-2596
as shown in Figure 2 or SEQ.ID. No.: 1.

3. An isolated nucleic acid molecule characterized by having a
sequence encoding a PsbB (WbpB) protein of P. aeruginosa. The nucleic acid molecule
preferably encodes PsbB having the amino acid sequence as shown in Figure 5 or SEQ.ID.
No.: 4, and most preferably comprises nucleotides 2670-3620 as shown in Figure 2 or SEQ.ID.
No.: 1.

4. An isolated nucleic acid molecule characterized by having a
sequence encoding a PsbC (WbpC) protein of P. aeruginosa which has acetyltransferase
activity. The nucleic acid molecule preferably encodes PsbC having the amino acid sequence
as shown in Figure 6 or SEQ.ID. No.: 5, and most preferably comprises nucleotides 3689-5578
as shown in Figure 2 or SEQ.ID. No.: 1.

5. An isolated nucleic acid molecule characterized by having a
sequence encoding a PsbD (WbpD) protein of P. aeruginosa which has acetyltransferase
activity. The nucleic acid molecule preferably encodes PsbD having the amino acid
sequence as shown in Figure 7 or SEQ.ID. No.: 6, and most preferably comprises nucleotides
5575-6066 as shown in Figure 2 or SEQ.ID. No.: 1.

6. An isolated nucleic acid molecule characterized by having a
sequence encoding a PsbE (WbpE) protein of P. aeruginosa. The nucleic acid molecule
preferably encodes PsbE having the amino acid sequence as shown in Figure 8 or SEQ.ID.
No.: 7, and most preferably comprises nucleotides 6152-6982 as shown in Figure 2 or SEQ.ID.
No.: 1.

7. An isolated nucleic acid molecule characterized by having a
sequence encoding a Rfc (Wzy) protein of P. aeruginosa which has O-polymerase activity.
The nucleic acid molecule preferably encodes Rfc having the amino acid sequence as shown
in Figure 9 or SEQ.ID. No.: 8, and most preferably comprises nucleotides 7236-8552 as shown
in Figure 2 or SEQ.ID. No.: 1. The nucleic acid molecule may comprise nucleotides 7236 to
8552 where base 8059 is "G". The Rfc coding region has a lower mol.% G+C than the P.
aeruginosa chromosomal average and it has similar amino acid composition and codon usage
to that reported for other Rfc proteins. Using a novel gene-replacement vector, the present
inventors were able to generate PAO1 chromosomal rfc mutants. These knockout mutants
express LPS containing complete core plus one O-repeat unit, indicating that they are no
longer producing a functional O-polymerase enzyme.

8. An isolated nucleic acid molecule characterized by having a
sequence encoding a PsbF (WbpF) protein of P. aeruginosa. The nucleic acid molecule
preferably encodes PsbF having the amino acid sequence as shown in Figure 10 or SEQ.ID.
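No.: 9, and most preferably comprises nucleotides 8549-9499 as shown in Figure 2 or SEQ.ID.
No.: 1.

9. An isolated nucleic acid molecule characterized by having a
sequence encoding a PsbG (WbpG) protein of P. aeruginosa. The nucleic acid molecule
preferably encodes PsbG having the amino acid sequence as shown in Figure 13 or SEQ.ID.
No.: 12, and most preferably comprises nucleotides 11281-12411 as shown in Figure 2 or
SEQ.ID. No.: 1. The present inventors have inserted a gentamicin cassette into psbG.

10. An isolated nucleic acid molecule characterized by having a
sequence encoding a PsbH (WbpH) protein of P. aeruginosa. The nucleic acid molecule
preferably encodes PsbH having the amino acid sequence as shown in Figure 14 or SEQ.ID.
No.: 13, and most preferably comprises nucleotides 12427-13548 as shown in Figure 2 or
SEQ.ID. No.: 1.

11. An isolated nucleic acid molecule characterized by having a
sequence encoding a PsbI (WbpI) protein of P. aeruginosa. The nucleic acid molecule
preferably encodes PsbI having the amino acid sequence as shown in Figure 15 or SEQ.ID.
No.: 14, and most preferably comprises nucleotides 13545-14633 as shown in Figure 2 or
SEQ.ID. No.: 1.

12. An isolated nucleic acid molecule characterized by having a
sequence encoding a PsbJ (WbpJ) protein of P. aeruginosa. The nucleic acid molecule
preferably encodes PsbJ having the amino acid sequence as shown in Figure 16 or SEQ.ID.
No.: 15, and most preferably comprises nucleotides 14651-15892 as shown in Figure 2 or
SEQ.ID. No.: 1.

13. An isolated nucleic acid molecule characterized by having a
sequence encoding a PsbK (WbpK) protein of P. aeruginosa. The nucleic acid molecule
preferably encodes PsbK having the amino acid sequence as shown in Figure 17 or SEQ.ID.
No.: 16, and most preferably comprises nucleotides 15889-16851 as shown in Figure 2 or
SEQ.ID. No.: 1.

14. An isolated nucleic acid molecule characterized by having a
sequence encoding a PsbM (WbpM) protein of P. aeruginosa. The nucleic acid molecule
preferably encodes PsbM having the amino acid sequence as shown in Figure 18 or SEQ.ID.
No.: 17. PsbM knockout mutants do not produce LPS.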






15. An isolated nucleic acid molecule characterized by having a
sequence encoding a PsbN (WbpN) protein of P. aeruginosa. The nucleic acid molecule
preferably encodes PsbN having the amino acid sequence as shown in Figure 19 or SEQ.ID.
No.: 18, and most preferably comprises nucleotides 22302-23693 as shown in Figure 2 or
SEQ.ID. No.: 1.

16. An isolated nucleic acid molecule characterized by having a
sequence encoding a UvrB protein of P. aeruginosa which is involved in ultraviolet repair.
The nucleic acid molecule preferably encodes UvrB having the amino acid sequence as
shown in Figure 20 or SEQ.ID. No.: 19, and most preferably comprises nucleotides 23704-24417
as shown in Figure 2 or SEQ.ID. No.: 1.

17. An isolated nucleic acid molecule characterized by having a
sequence encoding a RpsA protein for a 30S ribosomal subunit. The nucleic acid molecule
preferably encodes RpsA having the amino acid sequence as shown in Figure 49.

18. An isolated nucleic acid molecule characterized by having a
sequence encoding a HimD protein for a host integration factor. The nucleic acid molecule
preferably encodes HimD having the amino acid sequence as shown in Figure 50.
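
Embodiment 7 above compares the mol.% G+C of the Rfc coding region with the chromosomal average. How such a figure could be computed for a coding region given by 1-based, inclusive coordinates is sketched below. The function names are hypothetical, and the sequences in the usage note are toy strings rather than data from Figure 2.

```python
# Hypothetical helpers for a mol.% G+C comparison; not part of the specification.
def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a nucleotide string."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def region_gc_percent(seq: str, start: int, end: int) -> float:
    """mol.% G+C of nucleotides start..end (1-based, inclusive)."""
    if not (1 <= start <= end <= len(seq)):
        raise ValueError("coordinates fall outside the sequence")
    return gc_percent(seq[start - 1:end])
```

On toy input, `gc_percent("ATGC")` is 50.0, and `region_gc_percent("AAAAGGCCAAAA", 5, 8)` is 100.0 because positions 5 to 8 are `GGCC`. A coding region would be flagged as low in G+C when its value falls below the value computed for the whole chromosome.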

In an embodiment of the invention, the nucleic acid molecule contains
two genes from the gene cluster of the invention, preferably two genes which are adjacent in
the gene cluster. For example, the present inventors have found that rfc (wzy) and psbF
(wbpF) are cotranscribed and they are both required for B-band synthesis. If psbF (wbpF) is
absent, both A-band and B-band synthesis are knocked out, indicating that its gene product is
required for expression of A-band and B-band LPS onto the core oligosaccharide. Accordingly,
the invention provides a nucleic acid molecule encoding a PsbF (WbpF) protein and an Rfc
(Wzy) protein, preferably a nucleic acid molecule comprising nucleotides 7239 to 9499 as
shown in Figure 2 or SEQ.ID. No.: 1.

It will be appreciated that the invention includes nucleic acid
molecules encoding truncations of the proteins of the invention, and analogs and homologs of
the proteins of the invention and truncations thereof, as described below. It will further be
appreciated that variant forms of the nucleic acid molecules of the invention which arise
by alternative splicing of an mRNA corresponding to a cDNA of the invention are
encompassed by the invention.

Further, it will be appreciated that the invention includes nucleic acid
molecules comprising nucleic acid sequences having substantial sequence homology with the
nucleic acid sequences containing nucleotides 1-479; 1286-2596; 2670-3620; 3689-5578;
5575-6066; 6152-6982; 7236-8552; 8549-9499; 9831-10388; 10388-11143; 11281-12411; 12427-13548;
13545-14633; 14651-15892; 15889-16851; 17935-19144; 19678-21675; 22302-23693; or
23704-24417, as shown in Figure 2 or SEQ. ID. NO.: 1, and fragments thereof. The term "sequences
having substantial sequence homology" means those nucleic acid sequences which have
slight or inconsequential sequence variations from these sequences, i.e. the sequences function
in substantially the same manner to produce functionally equivalent proteins. The
variations may be attributable to local mutations or structural modifications.

Nucleic acid sequences having substantial homology include nucleic
acid sequences having at least 80-90%, preferably 90% identity with the nucleic acid
sequence 1-479; 1286-2596; 2670-3620; 3689-5578; 5575-6066; 6152-6982; 7236-8552; 8549-9499;
9831-10388; 10388-11143; 11281-12411; 12427-13548; 13545-14633; 14651-15892; 15889-16851;
17935-19144; 19678-21675; 22302-23693; or 23704-24417, as shown in Figure 2 or SEQ. ID. NO.:
1. By way of example, it is expected that a sequence having 80% identity with
the DNA sequence encoding PsbM of the invention will provide a functional PsbM protein.
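
One simplified way to apply the 80-90% identity threshold mentioned above is to count matching positions between two sequences that have already been aligned to the same length. The sketch below rests on that assumption (no gaps and no alignment step), and its function names are hypothetical. In practice identity would be reported by an alignment program.

```python
# Hypothetical, simplified identity calculation for two pre-aligned,
# equal-length sequences; gaps and the alignment itself are not handled.
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of positions at which two equal-length sequences match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("empty sequences")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def has_substantial_homology(seq_a: str, seq_b: str, threshold: float = 80.0) -> bool:
    """True when identity is at or above the chosen threshold (80% by default)."""
    return percent_identity(seq_a, seq_b) >= threshold
```

For the toy pair `ACGTACGTAC` and `ACGTACGTTT`, eight of ten positions match, so `percent_identity` returns 80.0 and the pair meets an 80% threshold but not a 90% threshold.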

Another aspect of the invention provides a nucleic acid molecule, and
fragments thereof having at least 15 bases, which hybridizes to the nucleic acid molecules
of the invention under hybridization conditions, preferably stringent hybridization
conditions. Appropriate stringency conditions which promote DNA hybridization are
known to those skilled in the art, or may be found in Current Protocols in Molecular Biology,
John Wiley & Sons, N.Y. (1989), 6.3.1-6.3.6. For example, the following may be employed:
6.0 x sodium chloride/sodium citrate (SSC) at about 45°C, followed by a wash of 2.0 x SSC
at 50°C. The stringency may be selected based on the conditions used in the wash step. For
example, the salt concentration in the wash step can be selected from a high stringency of
about 0.2 x SSC at 50°C. In addition, the temperature in the wash step can be at high
stringency conditions, at about 65°C.

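Isolated and purified nucleic acid molecules having sequences which
differ from the nucleic acid sequence shown in SEQ ID NO: 1 or Figure 1 (nucleotides 1-479;
1286-2596; 2670-3620; 3689-5578; 5575-6066; 6152-6982; 7236-8552; 8549-9499;
9831-10388; 10388-11143; 11281-12411; 12427-13548; 13545-14633; 14651-15892; 15889-16851;
17935-19144; 19678-21675; 22302-23693; or 23704-24417) due to degeneracy in the genetic
code are also within the scope of the invention. Such
nucleic acids encode functionally equivalent proteins (e.g. a PsbM (WbpM) protein having
dehydrogenase activity), but differ in sequence from the above-mentioned sequences due to
degeneracy in the genetic code.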
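
The effect of degeneracy can be illustrated with a short Python sketch that translates two different DNA sequences with the standard genetic code and shows that they encode the same peptide. The two example sequences are invented for illustration and are not taken from the psb gene cluster; the names `CODON_TABLE` and `translate` are likewise hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
# (first base varies slowest, third base fastest). "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA coding sequence into one-letter amino acids."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Two hypothetical sequences that differ at three third-codon positions.
seq_a = "ATGGCTCTGAAA"
seq_b = "ATGGCACTTAAG"

same_protein = translate(seq_a) == translate(seq_b)  # both give Met-Ala-Leu-Lys
```

Here `seq_a` and `seq_b` differ in nucleotide sequence yet yield the same amino acid sequence, which is the sense in which such molecules are functionally equivalent.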

An isolated nucleic acid molecule of the invention which comprises
DNA can be isolated by preparing a labelled nucleic acid probe based on all or part of the
nucleic acid sequences containing nucleotides 1-479; 1286-2596; 2670-3620; 3689-5578;
5575-6066; 6152-6982; 7236-8552; 8549-9499; 9831-10388; 10388-11143; 11281-12411; 12427-13548;
13545-14633; 14651-15892; 15889-16851; 17935-19144; 19678-21675; 22302-23693; or
23704-24417, as shown in Figure 2 or SEQ. ID. NO.: 2, and using this labelled nucleic acid probe to
screen an appropriate DNA library (e.g. a cDNA or genomic DNA library). For example, a
whole genomic library isolated from a microorganism, such as a serotype of P. aeruginosa,
can be used to isolate a DNA encoding a novel protein of the invention by screening the
library with the labelled probe using standard techniques. Nucleic acids isolated by
screening of a cDNA or genomic DNA library can be sequenced by standard techniques.

An isolated nucleic acid molecule of the invention which is DNA can
also be isolated by selectively amplifying a nucleic acid encoding a novel protein of the
invention using the polymerase chain reaction (PCR) methods and cDNA or genomic DNA.
It is possible to design synthetic oligonucleotide primers from the nucleic acid molecules
containing the nucleotides 1-479; 1286-2596; 2670-3620; 3689-5578; 5575-6066; 6152-6982;
7236-8552; 8549-9499; 9831-10388; 10388-11143; 11281-12411; 12427-13548; 13545-14633;
14651-15892; 15889-16851; 17935-19144; 19678-21675; 22302-23693; or 23704-24417, as shown
in Figure 2 or SEQ. ID. NO.: 2, for use in PCR. A nucleic acid can be amplified from cDNA or
genomic DNA using these oligonucleotide primers and standard PCR amplification
techniques. The nucleic acid so amplified can be cloned into an appropriate vector and
characterized by DNA sequence analysis. It will be appreciated that cDNA may be
prepared from mRNA, by isolating total cellular mRNA by a variety of techniques, for
example, by using the guanidinium-thiocyanate extraction procedure of Chirgwin et al.,
Biochemistry, 18, 5294-5299 (1979). cDNA is then synthesized from the mRNA using
reverse transcriptase (for example, Moloney MLV reverse transcriptase available from
Gibco/BRL, Bethesda, MD, or AMV reverse transcriptase available from Seikagaku
America, Inc., St. Petersburg, FL).
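
The design of primers from a nucleotide region can be sketched as follows. This is a minimal illustration, assuming 1-based inclusive coordinates of the kind listed above and a template supplied as a plain string; the function names, the default primer length and the toy template are hypothetical, and practical primer design would also weigh melting temperature and specificity, which are not modelled here.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def flanking_primers(template: str, start: int, end: int, primer_len: int = 20):
    """Return (forward, reverse) primers spanning template[start..end].

    start and end are 1-based and inclusive. The forward primer copies the
    5' end of the region; the reverse primer is the reverse complement of
    its 3' end, so the two primers anneal to opposite strands.
    """
    if not 1 <= start <= end <= len(template):
        raise ValueError("coordinates fall outside the template")
    region = template[start - 1:end].upper()
    if len(region) < 2 * primer_len:
        raise ValueError("region is too short for two non-overlapping primers")
    forward = region[:primer_len]
    reverse = reverse_complement(region[-primer_len:])
    return forward, reverse


# Toy 33-nucleotide template (not from Figure 2); region 7-27 is the target.
toy_template = "GGATCCATGAAACGTTTAGCTAGCTGACTCGAG"
toy_forward, toy_reverse = flanking_primers(toy_template, 7, 27, primer_len=6)
```

With the real sequence of Figure 2 loaded into `template`, a call such as `flanking_primers(template, 1286, 2596)` would return a primer pair bracketing that region.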

An isolated nucleic acid molecule of the invention which is RNA can be
isolated by cloning a cDNA encoding a novel protein of the invention into an appropriate
vector which allows for transcription of the cDNA to produce an RNA molecule which
encodes a novel protein of the invention. For example, a cDNA can be cloned downstream of
a bacteriophage promoter (e.g. a T7 promoter) in a vector, the cDNA can be transcribed in vitro
with T7 polymerase, and the resultant RNA can be isolated by standard techniques.

A nucleic acid molecule of the invention may also be chemically
synthesized using standard techniques. Various methods of chemically synthesizing
polydeoxynucleotides are known, including solid-phase synthesis which, like peptide
synthesis, has been fully automated in commercially available DNA synthesizers (See e.g.,
Itakura et al. U.S. Patent No. 4,598,049; Caruthers et al. U.S. Patent No. 4,458,066; and
Itakura U.S. Patent Nos. 4,401,796 and 4,373,071).

Determination of whether a particular nucleic acid molecule encodes a
novel protein of the invention may be accomplished by expressing the cDNA in an
appropriate host cell by standard techniques, and testing the activity of the protein using
the methods described herein. For example, the activity of the protein may
be tested by mixing with an appropriate substrate and assaying for dehydrogenase activity.
A cDNA so isolated can be sequenced by standard techniques, such as dideoxynucleotide
chain termination or Maxam-Gilbert chemical sequencing, to determine the nucleic acid
sequence and the predicted amino acid
sequence of the encoded protein.

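The initiation codon and untranslated sequences of the nucleic acid
molecules of the invention may be determined using currently available computer software
designed for the purpose, such as PC/Gene (IntelliGenetics Inc., Calif.). Regulatory
elements can be identified using conventional techniques, and the function of the elements
can be confirmed by using them to express a reporter gene which is operatively linked to
the elements.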

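The sequence of a nucleic acid molecule of the invention may be
inverted relative to its normal presentation for transcription to produce an antisense nucleic
acid molecule. Preferably, an antisense sequence is constructed by inverting a region
preceding the initiation codon or an unconserved region. In particular, the nucleic acid
sequences contained in the nucleic acid molecules of the invention, or a fragment thereof,
preferably one or more of the nucleic acid sequences shown in SEQ.
ID. NO. 1 and in Figure 2 (i.e. a nucleic acid molecule containing nucleotides 1-479; 1286-2596;
2670-3620; 3689-5578; 5575-6066; 6152-6982; 7236-8552; 8549-9499; 9831-10388; 10388-11143;
11281-12411; 12427-13548; 13545-14633; 14651-15892; 15889-16851; 17935-19144; 19678-
21675; 22302-23693; or 23704-24417), may be inverted relative to their normal presentation
for transcription to produce antisense nucleic acid molecules.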
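
The relationship between a sense sequence and the antisense RNA obtained by transcribing it in the inverted orientation can be sketched in Python. The function names and the six-nucleotide example are hypothetical and serve only to show the base-pairing logic; they are not part of the specification.

```python
def sense_mrna(sense_dna: str) -> str:
    """mRNA read from the sense (coding) strand: same sequence with U in place of T."""
    return sense_dna.upper().replace("T", "U")


def antisense_rna(sense_dna: str) -> str:
    """RNA produced when the region is transcribed in the inverted orientation.

    It is the reverse complement of the sense strand, written 5' to 3',
    and therefore base-pairs with the corresponding mRNA.
    """
    pairing = str.maketrans("ACGT", "UGCA")
    return sense_dna.upper().translate(pairing)[::-1]


def pairs_with(rna_a: str, rna_b: str) -> bool:
    """True when two RNAs of equal length form a fully Watson-Crick paired duplex."""
    partner = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return len(rna_a) == len(rna_b) and all(
        partner[a] == b for a, b in zip(rna_a, reversed(rna_b))
    )


example_sense = "ATGGCT"  # hypothetical six-nucleotide region
example_antisense = antisense_rna(example_sense)
```

In this toy case the mRNA is AUGGCU and the antisense transcript is AGCCAU, and `pairs_with` confirms that the two strands are complementary.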

The antisense nucleic acid molecules of the invention, or a fragment thereof, may be
chemically synthesized using naturally occurring nucleotides or variously
modified nucleotides designed to increase the biological stability of the molecules or to
increase the physical stability of the duplex formed with mRNA. The antisense sequences
may be produced biologically using an expression vector introduced into cells in the form of a
recombinant plasmid, phagemid or attenuated virus in which antisense sequences are
produced under the control of a high efficiency regulatory region, the activity of which
may be determined by the cell type into which the vector is introduced.

The invention also provides nucleic acids encoding fusion proteins
comprising a novel protein of the invention and a selected protein, or a selectable marker
protein (see below).

II. Novel Proteins of the Invention 

The invention further broadly contemplates an isolated protein
characterized in that it has part or all of the primary structural conformation (i.e.
continuous sequence of amino acid residues) of a novel protein encoded by a gene of the psb
gene cluster of the invention. In an embodiment of the invention, an isolated protein is
provided which has the amino acid sequence as shown in Figure 3 or SEQ ID NO:2 (Rol or
Wzz); Figure 4 or SEQ ID NO:3 (PsbA or WbpA); Figure 5 or SEQ ID NO:4 (PsbB or WbpB);
Figure 6 or SEQ ID NO:5 (PsbC or WbpC); Figure 7 or SEQ ID NO:6 (PsbD or WbpD); Figure
8 or SEQ ID NO:7 (PsbE or WbpE); Figure 9 or SEQ ID NO:8 (Rfc or Wzy); Figure 10 or SEQ
ID NO:9 (PsbF or WbpF); Figure 11 or SEQ ID NO:10 (HisH); Figure 12 or SEQ ID NO:11
(HisF); Figure 13 or SEQ ID NO:12 (PsbG or WbpG); Figure 14 or SEQ ID NO:13 (PsbH or
WbpH); Figure 15 or SEQ ID NO:14 (PsbI or WbpI); Figure 16 or SEQ ID NO:15 (PsbJ or
WbpJ); Figure 17 or SEQ ID NO:16 (PsbK or WbpK); Figure 18 or SEQ ID NO:17 (PsbM or
WbpM); Figure 19 or SEQ ID NO:18 (PsbN or WbpN); or Figure 20 or SEQ ID NO:19 (UvrB).

The gene products of rol, psbA, psbB, psbC, psbD, psbE, rfc, psbF, hisH,
hisF, psbG, psbH, psbI, psbJ, psbL, and psbK (also known as wzz, wbpA, wbpB, wbpC,
wbpD, wbpE, wzy, wbpF, hisH, hisF, wbpG, wbpH, wbpI, wbpJ, respectively) are expected
to be found in serotypes O2, O5, O16, O18, and O20, and the gene products of psbM and psbN
(also known as wbpM and wbpN, respectively) are expected to be found in serotypes O1 to
O20. The gene products of hisF and hisH are not found in serotype O6.

Specific embodiments of the invention include the following:

1. An isolated Rol (Wzz) protein of P. aeruginosa which regulates O-antigen
linking, having the amino acid sequence as shown in Figure 3 or SEQ.ID. No.: 2. The
function of Rol may be associated with the Rfc protein.

2. An isolated PsbA (WbpA) protein of P. aeruginosa which has
dehydrogenase activity, and the amino acid sequence as shown in Figure 4 or SEQ.ID. No.:
3. PsbA may be involved in the biosynthesis of mannuronic acid residues.

3. An isolated PsbB (WbpB) protein of P. aeruginosa having the amino
acid sequence as shown in Figure 5 or SEQ.ID. No.: 4. PsbB may be involved in Fuc2NAc
biosynthesis.

4. An isolated PsbC (WbpC) protein of P. aeruginosa which has
acetyltransferase activity and the amino acid sequence as shown in Figure 6 or SEQ.ID. No.:
5. PsbC may be involved in the acetylation of mannuronic acid residues in the O-antigen.

5. An isolated PsbD (WbpD) protein of P. aeruginosa which has
acetyltransferase activity and the amino acid sequence as shown in Figure 7 or SEQ.ID. No.:
6. PsbD may be involved in the acetylation of mannuronic acid residues in the O-antigen.

6. An isolated PsbE (WbpE) protein of P. aeruginosa, having the amino
acid sequence as shown in Figure 8 or SEQ.ID. No.: 7. PsbE may be involved in the
biosynthesis of 2,3-, 2,4-, and 2,6-dideoxy sugars such as 2,3-dideoxy mannuronic acid
produced by P. aeruginosa O5.

7. An isolated Rfc (Wzy) protein of P. aeruginosa which has O-polymerase
activity and the amino acid sequence as shown in Figure 9 or SEQ.ID. No.: 8.
The Rfc protein is characterized as very hydrophobic, and it is an integral membrane
protein with 11 putative membrane spanning domains.

8. An isolated PsbF (WbpF) protein of P. aeruginosa, having the amino
acid sequence as shown in Figure 10 or SEQ.ID. No.: 9. PsbF is translationally coupled with
rfc and it is a putative flippase.

9. An isolated PsbG (WbpG) protein of P. aeruginosa which has the
amino acid sequence as shown in Figure 13 or SEQ.ID. No.: 12.

10. An isolated PsbH (WbpH) protein of P. aeruginosa which has
ManA transferase activity and the amino acid sequence as shown in Figure 14 or SEQ.ID.
No.: 13. PsbH may be involved in the addition of ManA (i.e. Man(2NAc3N)A) to the
O-antigen unit.

11. An isolated PsbI (WbpI) protein of P. aeruginosa which converts
UDP-N-acetylglucosamine to UDP-N-acetylmannosamine, and has the amino acid
sequence as shown in Figure 15 or SEQ.ID. No.: 14.

12. An isolated PsbJ (WbpJ) protein of P. aeruginosa which has ManA
transferase activity, and the amino acid sequence as shown in Figure 16 or SEQ.ID. No.: 15.
Based on their gene order and their relative hydropathic indices, the psbJ and psbH gene
products are thought to transfer Man(NAc)2A and Man(2NAc3N)A, respectively.

13. An isolated PsbK (WbpK) protein of P. aeruginosa which has
dehydratase activity, and the amino acid sequence as shown in Figure 17 or SEQ.ID. No.:
16.

14. An isolated PsbM (WbpM) protein of P. aeruginosa having
dehydrogenase activity, and the amino acid sequence as shown in Figure 18 or SEQ.ID. No.:
17. PsbM is involved in the biosynthesis of N-acetylfucosamine residues of the O-antigen.
PsbM contains 2 NAD binding domains.

15. An isolated PsbN (WbpN) protein of P. aeruginosa, having the
amino acid sequence as shown in Figure 19 or SEQ.ID. No.: 18.

16. A UvrB protein of P. aeruginosa which is involved in ultraviolet
repair and has the amino acid sequence as shown in Figure 20 or SEQ.ID. No.: 19.

The molecular weights, isoelectric points, and hydropathic indices of
the Rol (Wzz), PsbA (WbpA), PsbB (WbpB), PsbC (WbpC), PsbD (WbpD), PsbE (WbpE),
Rfc (Wzy), PsbF (WbpF), PsbG (WbpG), PsbH (WbpH), PsbI (WbpI), PsbJ (WbpJ), PsbK
(WbpK), PsbM (WbpM) and PsbN (WbpN) proteins are shown in Table 1.

Within the context of the present invention, a protein of the invention
may include various structural forms of the primary protein which retain biological
activity. For example, a protein of the invention may be in the form of acidic or basic salts
or in neutral form. In addition, individual amino acid residues may be modified by
oxidation or reduction.

In addition to the full length amino acid sequences (Figures 3 to 20 or
SEQ.ID. NOS: 2 to 19), the proteins of the present invention may also include truncations of
the proteins, and analogs and homologs of the proteins and truncations thereof as described
herein. Truncated proteins may comprise peptides of at least fifteen amino acid residues.

The proteins of the invention may also include analogs of the proteins
having the amino acid sequences shown in Figures 3 to 20, or SEQ.ID. NOS: 2 to 19, and/or
truncations thereof as described herein, which may include, but are not limited to, an amino
acid sequence containing one or more amino acid substitutions, insertions, and/or deletions.
Amino acid substitutions may be of a conserved or non-conserved nature. Conserved amino
acid substitutions involve replacing one or more amino acids of the proteins of the invention
with amino acids of similar charge, size, and/or hydrophobicity characteristics. When
only conserved substitutions are made the resulting analog should be functionally
equivalent. Non-conserved substitutions involve replacing one or more amino acids of the
amino acid sequence with one or more amino acids which possess dissimilar charge, size,
and/or hydrophobicity characteristics.
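
The distinction between conserved and non-conserved substitutions can be sketched in Python. The grouping of residues below is one common textbook classification and is an assumption made for illustration; the specification does not define particular residue classes, and the names `RESIDUE_CLASSES` and `is_conserved_substitution` are hypothetical.

```python
# Assumed physicochemical classes (one-letter amino acid codes).
RESIDUE_CLASSES = {
    "nonpolar aliphatic": set("GAVLIMP"),
    "aromatic": set("FWY"),
    "polar uncharged": set("STCNQ"),
    "basic": set("KRH"),
    "acidic": set("DE"),
}


def residue_class(amino_acid: str) -> str:
    """Return the name of the class containing the given residue."""
    for name, members in RESIDUE_CLASSES.items():
        if amino_acid.upper() in members:
            return name
    raise ValueError(f"unknown amino acid code: {amino_acid!r}")


def is_conserved_substitution(original: str, replacement: str) -> bool:
    """True when a residue is replaced by a different residue of the same class."""
    if original.upper() == replacement.upper():
        return False  # not a substitution at all
    return residue_class(original) == residue_class(replacement)
```

Under this assumed grouping, replacing leucine with isoleucine counts as conserved, whereas replacing lysine with aspartate does not.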

One or more amino acid insertions may be introduced into the amino
acid sequences shown in Figures 3 to 20, or SEQ.ID. NOS: 2 to 19. Amino acid insertions may
consist of single amino acid residues or sequential amino acids ranging from 2 to 15 amino
acids in length. For example, amino acid insertions may be used to destroy target sequences
so that the protein is no longer active. This procedure may be used in vivo to inhibit the
activity of a protein of the invention.

Deletions may consist of the removal of one or more amino acids, or
discrete portions from the amino acid sequences shown in Figures 3 to 20 or SEQ.ID. NOS: 2 to
19. The deleted amino acids may or may not be contiguous. The lower limit length of the
resulting analog with a deletion mutation is about 10 amino acids, preferably 100 amino
acids.

fusion proteins. Additionally, immunogenic portions of a protein of the invention are
within the scope of the invention.

The proteins of the invention (including truncations, analogs, etc.) may 
be prepared using recombinant DNA methods. Accordingly, the nucleic acid molecules of 
the present invention having a sequence which encodes a protein of the invention may be 
incorporated in a known manner into an appropriate expression vector which ensures good 
expression of the protein. Possible expression vectors include but are not limited to cosmids, 
plasmids, or modified viruses (e.g. replication defective retroviruses, adenoviruses and 
adeno-associated viruses), so long as the vector is compatible with the host cell used. The 
statement that the expression vectors are "suitable for transformation of a host cell" means
that the expression vectors contain a nucleic acid molecule of the invention and regulatory
sequences, selected on the basis of the host cells to be used for expression, which are
operatively linked to the nucleic acid molecule. Operatively linked is intended to mean that
the nucleic acid is linked to regulatory sequences in a manner which allows expression of
the nucleic acid.

The invention therefore contemplates a recombinant expression vector
of the invention containing a nucleic acid molecule of the invention, or a fragment thereof,
and the necessary regulatory sequences for the transcription and translation of the inserted
protein-sequence. Suitable regulatory sequences may be derived from a variety of sources,
including bacterial, fungal, or viral genes (for example, see the regulatory sequences
described in Goeddel, Gene Expression Technology: Methods in Enzymology 185, Academic
Press, San Diego, CA (1990)). Selection of appropriate regulatory sequences is dependent on
the host cell chosen as discussed below, and may be readily accomplished by one of ordinary
skill in the art. Examples of such regulatory sequences include: a transcriptional promoter
and enhancer or RNA polymerase binding sequence, a ribosomal binding sequence, including
a translation initiation signal. Additionally, depending on the host cell chosen and the
vector employed, other sequences, such as an origin of replication, additional DNA
restriction sites, enhancers, and sequences conferring inducibility of transcription may be
incorporated into the expression vector. It will also be appreciated that the necessary
regulatory sequences may be supplied by the native protein and/or its flanking regions.
The invention further provides a recombinant expression vector
comprising a DNA nucleic acid molecule of the invention cloned into the expression vector in
an antisense orientation. That is, the DNA molecule is operatively linked to a regulatory
sequence in a manner which allows for expression, by transcription of the DNA molecule, of
an RNA molecule which is antisense to a nucleotide sequence comprising 1-479; 1293-2596;
2670-3620; 3277-5577; 5574-6065; 6151-6981; 7235-8551; 8548-9498; 9830-10388; 10388-11143;
11281-12411; 12427-13548; 13545-14633; 14651-15892; 15889-16851; 18032-19141; 19678-21675;
22302-23693; or 23704-24417, as shown in Figure 2 or SEQ. ID. NO.: 2. Regulatory sequences

Sambrook et al. (Molecular Cloning: A Laboratory Manual, 2nd Edition, Cold Spring Harbor
Laboratory Press (1989)), and other laboratory textbooks.

Suitable host cells include a wide variety of prokaryotic and 
eukaryotic host cells. For example, the proteins of the invention may be expressed in 
bacterial cells such as E. coli, insect cells (using baculovirus), yeast cells or mammalian 
cells. Other suitable host cells can be found in Goeddel, Gene Expression Technology: 
Methods in Enzymology 185, Academic Press, San Diego, CA (1991).

More particularly, bacterial host cells suitable for carrying out the
present invention include E. coli, as well as many other bacterial species well known to one
of ordinary skill in the art. Bacterial expression vectors preferably comprise a promoter
which functions in the host cell, one or more selectable phenotypic markers, and a bacterial
origin of replication. Representative promoters include the β-lactamase (penicillinase) and
lactose promoter system (see Chang et al., Nature 275:615, 1978), the trp promoter (Nichols
and Yanofsky, Meth. in Enzymology 101:155, 1983) and the tac promoter (Russell et al., Gene
20:231, 1982). Representative selectable markers include various antibiotic resistance
markers such as the kanamycin or ampicillin resistance genes. Suitable expression vectors
include but are not limited to bacteriophages such as lambda derivatives or plasmids such
as pBR322 (see Bolivar et al., Gene 2:95, 1977), the pUC plasmids pUC18, pUC19, pUC118,
pUC119 (see Messing, Meth. in Enzymology 101:20-77, 1983 and Vieira and Messing, Gene
19:259-268, 1982), and pNH8A, pNH16a, pNH18a, and Bluescript M13 (Stratagene, La
Jolla, Calif.).

Yeast and fungi host cells suitable for carrying out the present
invention include, but are not limited to, Saccharomyces cerevisiae, the genera Pichia or
Kluyveromyces and various species of the genus Aspergillus. Examples of vectors for
expression in yeast S. cerevisiae include pYepSec1 (Baldari et al., (1987) Embo J. 6:229-234),
pMFa (Kurjan and Herskowitz, (1982) Cell 30:933-943), pJRY88 (Schultz et al., (1987) Gene
54:113-123), and pYES2 (Invitrogen Corporation, San Diego, CA). Protocols for the
transformation of yeast and fungi are well known to those of ordinary skill in the art (see
Hinnen et al., PNAS USA 75:1929, 1978; Itoh et al., J. Bacteriology 153:163, 1983; and Cullen
et al., Bio/Technology 5:369, 1987).

The proteins of the invention may also be prepared by chemical
synthesis using techniques well known in the chemistry of proteins such as solid phase
synthesis (Merrifield, 1964, J. Am. Chem. Soc. 85:2149-2154) or synthesis in homogeneous
solution (Houben-Weyl, 1987, Methods of Organic Chemistry, ed. E. Wünsch, Vol. 15 I and
II, Thieme, Stuttgart).

III. Applications

detectable marker as described herein and the hybridization product may be assayed by
detecting the detectable marker or the detectable change produced by the detectable
marker.

The nucleic acid molecule of the invention also permits the
identification and isolation, or synthesis, of nucleotide sequences which may be used as
primers to amplify a nucleic acid molecule of the invention, for example in the polymerase
chain reaction (PCR) which is discussed in more detail below. The primers may be used to
amplify the genomic DNA of other bacterial species known to have LPS. The PCR
amplified sequences can be examined to determine the relationship between the various
LPS genes.

The length and bases of the primers for use in the PCR are selected so
that they will hybridize to different strands of the desired sequence and at relative
positions along the sequence such that an extension product synthesized from one primer,
when it is separated from its template, can serve as a template for extension of the other
primer into a nucleic acid of defined length.

Primers which may be used in the invention are oligonucleotides, i.e.
molecules containing two or more deoxyribonucleotides of the nucleic acid molecule of the
invention which occur naturally as in a purified restriction endonuclease digest or are
produced synthetically using techniques known in the art such as, for example,
phosphotriester and phosphodiester methods (See Good et al., Nucl. Acid Res. 4:2157, 1977)
or automated techniques (See for example, Connolly, B.A., Nucleic Acids Res. 15(7):3131,
1987). The primers are capable of acting as a point of initiation of synthesis when placed
under conditions which permit the synthesis of a primer extension product which is
complementary to the DNA sequence of the invention, i.e. in the presence of nucleotide
substrates, an agent for polymerization such as DNA polymerase, and at suitable
temperature and pH. Preferably, the primers are sequences that do not form secondary
structures by base pairing with other copies of the primer or sequences that form a hairpin
configuration. The primer preferably contains between about 7 and 25 nucleotides.
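
The two preferences stated above (a length of about 7 to 25 nucleotides, and no tendency to pair with itself or with another copy of the primer) can be screened with a simple Python heuristic. The stem length of four bases, the function names and the example primers are assumptions chosen for illustration; the specification does not set a particular test for secondary structure.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def has_self_complementarity(primer: str, stem: int = 4) -> bool:
    """Flag primers in which some stretch of `stem` bases can pair with another
    stretch of the same primer (a possible hairpin) or of a second copy of the
    primer (a possible primer dimer)."""
    primer = primer.upper()
    kmers = {primer[i:i + stem] for i in range(len(primer) - stem + 1)}
    flipped = reverse_complement(primer)
    return any(
        flipped[i:i + stem] in kmers for i in range(len(flipped) - stem + 1)
    )


def acceptable_primer(primer: str, min_len: int = 7, max_len: int = 25) -> bool:
    """Apply the length preference from the text plus the pairing heuristic."""
    return min_len <= len(primer) <= max_len and not has_self_complementarity(primer)
```

A primer such as GGGGAAAACCCC is rejected because its two ends can pair with each other, while a primer shorter than seven nucleotides is rejected on length alone.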

The primers may be labelled with detectable markers which allow for
detection of the amplified products. Suitable detectable markers are radioactive markers
such as P-32, S-35, I-125, and H-3, luminescent markers such as chemiluminescent markers,
preferably luminol, and fluorescent markers, preferably dansyl chloride,
fluorescein-5-isothiocyanate, and 4-fluoro-7-nitrobenz-2-oxa-1,3-diazole, enzyme markers
such as horseradish peroxidase, alkaline phosphatase, β-galactosidase,
acetylcholinesterase, or biotin.

It will be appreciated that the primers may contain
non-complementary sequences provided that a sufficient amount of the primer contains a

ethidium bromide, under ultraviolet (UV) light. DNA may be amplified to a desired
level and a further extension reaction may be performed to incorporate nucleotide
derivatives having detectable markers such as radioactive labelled or biotin labelled
nucleoside triphosphates. The primers may also be labelled with detectable markers as
discussed above. The detectable markers may be analyzed by restriction and electrophoretic
separation or other techniques known in the art.

The conditions which may be employed in the methods of the
invention using PCR are those which permit hybridization and amplification reactions to
proceed in the presence of DNA in a sample and appropriate complementary hybridization
primers. Conditions suitable for the polymerase chain reaction are generally known in the
art. For example, see M.A. Innis and D.H. Gelfand, PCR Protocols, A Guide to Methods and
Applications, M.A. Innis, D.H. Gelfand, J.J. Sninsky and T.J. White eds, pp. 3-12, Academic
Press 1989, which is incorporated herein by reference. Preferably, the PCR utilizes
polymerase obtained from the thermophilic bacterium Thermus aquaticus (Taq polymerase,
GeneAmp Kit, Perkin Elmer Cetus); other thermostable polymerases may also be used to
amplify DNA template strands.

It will be appreciated that other techniques such as the Ligase Chain
Reaction (LCR) and NASBA may be used to amplify a nucleic acid molecule of the invention
(Barany in "PCR Methods and Applications", August 1991, Vol. 1(1), page 5, and European
Published Application No. 0320308, published June 14, 1989, and U.S. Serial No. 5,130,238
to Malek).

A protein of the invention can be used to prepare antibodies specific for
the protein. Antibodies can be prepared which bind a distinct epitope in an unconserved
region of the protein. An unconserved region of the protein is one which does not have
substantial sequence homology to other proteins. Alternatively, a region from a well-characterized
domain can be used to prepare an antibody to a conserved region of a protein
of the invention. Antibodies having specificity for a protein of the invention may also be
raised from fusion proteins.

Conventional methods can be used to prepare the antibodies. For
example, by using a peptide of a protein of the invention, polyclonal antisera or monoclonal
antibodies can be made using standard methods. A mammal (e.g., a mouse, hamster, or
rabbit) can be immunized with an immunogenic form of the peptide which elicits an
antibody response in the mammal. Techniques for conferring immunogenicity on a peptide
include conjugation to carriers or other techniques well known in the art. For example, the
peptide can be administered in the presence of adjuvant. The progress of immunization can
be monitored by detection of antibody titers in plasma or serum. Standard ELISA or other
immunoassay procedures can be used with the immunogen as antigen to assess the levels of
antibodies. Following immunization, antisera can be obtained and, if desired, polyclonal
antibodies isolated from the sera.

To produce monoclonal antibodies, antibody producing cells
(lymphocytes) can be harvested from an immunized animal and fused with myeloma cells
by standard somatic cell fusion procedures, thus immortalizing these cells and yielding
hybridoma cells. Such techniques are well known in the art (e.g., the hybridoma technique
originally developed by Kohler and Milstein (Nature 256, 495-497 (1975)) as well as other
techniques such as the human B-cell hybridoma technique (Kozbor et al., Immunol. Today
4, 72 (1983)), the EBV-hybridoma technique to produce human monoclonal antibodies (Cole
et al., Monoclonal Antibodies in Cancer Therapy (1985) Alan R. Liss, Inc., pages 77-96),
and screening of combinatorial antibody libraries (Huse et al., Science 246, 1275 (1989))).
Hybridoma cells can be screened immunochemically for production of antibodies
specifically reactive with the peptide and the monoclonal antibodies can be isolated.
Therefore, the invention also contemplates hybridoma cells secreting monoclonal antibodies
with specificity for a protein of the invention.

The term "antibody" as used herein is intended to include fragments
thereof which also specifically react with a protein of the invention.
Antibodies can be fragmented using conventional techniques and the fragments screened for
utility in the same manner as described above. For example, F(ab')2 fragments can be
generated by treating antibody with pepsin. The resulting F(ab')2 fragment can be treated
to reduce disulfide bridges to produce Fab' fragments.

Chimeric antibody derivatives, i.e., antibody molecules that combine
a non-human animal variable region and a human constant region, are also contemplated
within the scope of the invention. Chimeric antibody molecules can include, for example,
the antigen binding domain from an antibody of a mouse, rat, or other species, with human
constant regions. Conventional methods may be used to make chimeric antibodies containing
the immunoglobulin variable region which recognizes the gene product of the genes of the
psb gene cluster of the invention (See, for example, Morrison et al., Proc. Natl. Acad. Sci. U.S.A.
81, 6851 (1985); Takeda et al., Nature 314, 452 (1985); Cabilly et al., U.S. Patent No.
4,816,567; Boss et al., U.S. Patent No. 4,816,397; Tanaguchi et al., European Patent
Publication EP171496; European Patent Publication 0173494; GB

Monoclonal or chimeric antibodies specifically reactive with a protein
of the invention as described herein can be further humanized by producing human constant
region chimeras, in which parts of the variable regions, particularly the conserved
framework regions of the antigen-binding domain, are of human origin and only the
hypervariable regions are of non-human origin. Such immunoglobulin molecules may be
made by techniques known in the art (e.g., Teng et al., Proc. Natl. Acad. Sci. U.S.A., 80,
7308-7312 (1983); Kozbor et al., Immunology Today, 4, 7279 (1983); Olsson et al., Meth.
Enzymol., 92, 3-16 (1982); and PCT Publication WO92/06193 or EP 0239400). Humanized
antibodies can also be commercially produced (Scotgen Limited, 2 Holly Road,
Twickenham, Middlesex, Great Britain).

Specific antibodies, or antibody fragments, reactive against proteins of
the invention may also be generated by screening expression libraries encoding
immunoglobulin genes, or portions thereof, expressed in bacteria with peptides produced
from the nucleic acid molecules of the present invention. For example, complete Fab
fragments, VH regions and FV regions can be expressed in bacteria using phage expression
libraries (See for example Ward et al., Nature 341, 544-546 (1989); Huse et al., Science 246,
1275-1281 (1989); and McCafferty et al., Nature 348, 552-554 (1990)). In an embodiment of
the invention, antibodies that bind to an epitope of a protein of the invention are
engineered using the procedures described in N. Tout and J. Lam (Clin. Diagn. Lab. Immunol.
Vol. 4(2):147-155, 1997).

The antibodies may be labelled with a detectable marker including
various enzymes, fluorescent materials, luminescent materials and radioactive materials.
Examples of suitable enzymes include horseradish peroxidase, biotin, alkaline
phosphatase, β-galactosidase, or acetylcholinesterase; examples of suitable fluorescent
materials include umbelliferone, fluorescein, fluorescein isothiocyanate, rhodamine,
dichlorotriazinylamine fluorescein, dansyl chloride or phycoerythrin; an example of a
luminescent material includes luminol; and examples of suitable radioactive material
include S-35, Cu-64, Ga-67, Zr-89, Ru-97, Tc-99m, Rh-105, Pd-109, In-111, I-123, I-125, I-131,
Re-186, Au-198, Au-199, Pb-203, At-211, Pb-212 and Bi-212. The antibodies may also be
labelled or conjugated to one partner of a ligand binding pair. Representative examples
include avidin-biotin and riboflavin-riboflavin binding protein. Methods for conjugating or
labelling the antibodies discussed above with the representative labels set forth above
may be readily accomplished using conventional techniques.

The antibodies reactive against proteins of the invention (e.g. enzyme
conjugates or labeled derivatives) may be used to detect a protein of the invention in various
samples; for example, they may be used in any known immunoassays which rely on the
binding interaction between an antigenic determinant of a protein of the invention and the
antibodies. Examples of such assays are radioimmunoassays, enzyme immunoassays (e.g.
ELISA), immunofluorescence, immunoprecipitation, latex agglutination, hemagglutination,
and histochemical tests. Thus, the antibodies may be used to identify or quantify the
amount of a protein of the invention in a sample in order to diagnose P. aeruginosa
infections.

A sample may be tested for the presence or absence of P. aeruginosa
serotypes O1 to O20 by contacting the sample with an antibody specific for an epitope of
PsbM (WbpM) or PsbN (WbpN), which antibody is capable of being detected after it
becomes bound to PsbM (WbpM) or PsbN (WbpN) in the sample, and assaying for antibody
bound to PsbM (WbpM) or PsbN (WbpN) in the sample, or unreacted antibody. A sample
may also be tested for the presence or absence of P. aeruginosa serotypes O2, O5, O16, O18,
and O20 by contacting the sample with an antibody specific for an epitope of a Rol, PsbA,
PsbB, PsbC, PsbD, PsbE, Rfc, PsbF, PsbG, PsbH, PsbI, PsbJ, PsbK (also known as Wzz, WbpA,
WbpB, WbpC, WbpD, WbpE, Wzy, WbpF, WbpG, WbpH, WbpI, WbpJ, WbpK,
respectively), HisH or HisF protein, which antibody is capable of being detected after it
becomes bound to the protein in the sample, and assaying for antibody bound to protein in
the sample, or unreacted antibody.
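
The pairing of the older Rol/Rfc/Psb names with the Wzz/Wzy/Wbp names listed above can be
written as a simple lookup table. The following Python sketch is illustrative only: the
dictionary and the helper name current_name are assumptions introduced here, and the pairing
simply follows the order of the two lists in the text ("respectively").

```python
# Older names as listed in the text, paired position by position with the
# names given in parentheses. The positional pairing is an assumption based
# on the word "respectively" in the text.
OLD_NAMES = ["Rol", "PsbA", "PsbB", "PsbC", "PsbD", "PsbE", "Rfc",
             "PsbF", "PsbG", "PsbH", "PsbI", "PsbJ", "PsbK"]
NEW_NAMES = ["Wzz", "WbpA", "WbpB", "WbpC", "WbpD", "WbpE", "Wzy",
             "WbpF", "WbpG", "WbpH", "WbpI", "WbpJ", "WbpK"]

PSB_TO_WBP = dict(zip(OLD_NAMES, NEW_NAMES))
# PsbM and PsbN are given with their Wbp names in the same passage.
PSB_TO_WBP.update({"PsbM": "WbpM", "PsbN": "WbpN"})


def current_name(old_name: str) -> str:
    """Return the Wzz/Wzy/Wbp name for an older name (hypothetical helper)."""
    return PSB_TO_WBP[old_name]


print(current_name("Rfc"))
```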

In a method of the invention a predetermined amount of a sample or 
concentrated sample is mixed with antibody or labelled antibody. The amount of antibody 
used in the process is dependent upon the labelling agent chosen. The resulting protein bound
to antibody or labelled antibody may be isolated by conventional isolation techniques, for
example, salting out, chromatography, electrophoresis, gel filtration, fractionation,
absorption, polyacrylamide gel electrophoresis, agglutination, or combinations thereof.

The sample or antibody may be insolubilized; for example, the sample
or antibody can be reacted using known methods with a suitable carrier. Examples of
suitable carriers are Sepharose or agarose beads. When an insolubilized sample or antibody
is used, protein bound to antibody or unreacted antibody is isolated by washing. For example,
when the sample is blotted onto a nitrocellulose membrane, the antibody bound to a protein
of the invention is separated from the unreacted antibody by washing with a buffer, for
example, phosphate buffered saline (PBS) with bovine serum albumin (BSA).

When labelled antibody is used, the presence of a P. aeruginosa 
serotype can be determined by measuring the amount of labelled antibody bound to a protein 
of the invention in the sample or of the unreacted labelled antibody. The appropriate 
method of measuring the labelled material is dependent upon the labelling agent.

When unlabeled antibody is used in the method of the invention, the
presence of a P. aeruginosa serotype can be determined by measuring the amount of antibody
bound to the P. aeruginosa serotype using substances that interact specifically with the
antibody to cause agglutination or precipitation. In particular, labelled antibody against an
antibody specific for a protein of the invention can be added to the reaction mixture. The
presence of a P. aeruginosa serotype can be determined by a suitable method from among the
already described techniques depending on the type of labelling agent. The antibody
against an antibody specific for a protein of the invention can be prepared and labelled by
conventional procedures known in the art which have been described herein. The antibody
against an antibody specific for a protein of the invention may be a species specific
anti-immunoglobulin antibody or monoclonal antibody; for example, goat anti-rabbit
antibody may be used to detect rabbit antibody specific for a protein of the invention.

The reagents suitable for applying the methods of the invention may
be packaged into convenient kits providing the necessary materials, packaged into suitable
containers. Such kits may include all the reagents required to detect a P. aeruginosa
serotype in a sample by means of the methods described herein, and optionally suitable
supports useful in performing the methods of the invention.

In one embodiment of the invention the kit contains a nucleotide probe
which hybridizes with a nucleic acid molecule of the invention, reagents required for
hybridization of the nucleotide probe with the nucleic acid molecule, and directions for its
use. In another embodiment of the invention the kit includes antibodies of the invention and
reagents required for binding of the antibody to a protein specific for a P. aeruginosa
serotype in a sample. In still another embodiment of the invention, the kit includes primers
which are capable of amplifying a nucleic acid molecule of the invention or a
predetermined oligonucleotide fragment thereof, all the reagents required to produce the
amplified nucleic acid molecule or predetermined fragment thereof in the polymerase chain
reaction, and means for assaying the amplified sequences.

The methods and kits of the present invention have many practical
applications. For example, the methods and kits of the present invention may be used to
detect a P. aeruginosa serotype in any medical or veterinary sample suspected of containing
P. aeruginosa. Samples which may be tested include bodily materials such as blood, urine,
tissues and the like. Typically the sample is a clinical specimen from wound, burn and
urinary tract infections. In addition to human samples, samples may be taken from
mammals such as non-human primates, etc. Further, water and food samples and other
environmental samples and industrial wastes may be tested.

Before testing a sample in accordance with the methods described 
herein, the sample may be concentrated using techniques known in the art, such as
centrifugation and filtration. For the hybridization and/or PCR-based methods described
herein, nucleic acids may be extracted from cell extracts of the test sample using techniques 
known in the art. 

Substances that Affect O-antigen synthesis and assembly 

A protein of the invention may also be used to assay for a substance
which affects O-antigen synthesis or assembly in P. aeruginosa. Accordingly, the invention
provides a method for assaying for a substance that affects O-antigen synthesis or assembly
in P. aeruginosa comprising mixing a protein of the invention with a test substance which is
suspected of affecting the expression or activity of the protein, and determining the effect of
the substance by comparing to a control.

In an embodiment of the invention the protein is an enzyme, and a
method is provided for assaying for a substance that affects O-antigen synthesis and 
assembly in P. aeruginosa comprising incubating a protein of the invention with a substrate 
of the protein, and a test substance which is suspected of affecting the activity of the 
protein, and determining the effect of the substance by comparing to a control. 
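
One simple way to express "the effect of the substance by comparing to a control" is as a
percentage change in measured enzyme activity. The Python sketch below is illustrative only:
the function name, the arbitrary activity units and the example readings are assumptions
and are not taken from the description.

```python
def percent_inhibition(control_activity: float, test_activity: float) -> float:
    """Percent decrease in activity relative to the no-substance control.

    Positive values indicate inhibition by the test substance;
    negative values indicate stimulation.
    """
    if control_activity <= 0:
        raise ValueError("control activity must be positive")
    return 100.0 * (control_activity - test_activity) / control_activity


# Hypothetical readings in arbitrary units (not data from the description).
print(percent_inhibition(control_activity=2.0, test_activity=0.5))
```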

In a preferred embodiment the protein is PsbM which has
dehydrogenase activity. Representative substrates which may be used with PsbM in the
assay are precursor sugars such as glucose. Dehydrogenase activity may be assayed using
conventional methods.

Compositions and Methods of Treatment 

The substances identified by the methods described herein, antisense
nucleic acid molecules, and antibodies, may be used for modulating one or both of O-antigen
synthesis and assembly in P. aeruginosa and accordingly may be used in the treatment of
infections caused by P. aeruginosa. O-antigen is a virulence factor of P. aeruginosa and it is
responsible for serum resistance. Therefore, substances which can target LPS biosynthesis in
[...] to the [...] making "rough" LPS devoid of the long chain O-antigen (B-band) polymers
will be useful in rendering the bacterium susceptible to attack by
host defense mechanisms. The substances identified by the methods described herein,
antisense nucleic acid molecules, and antibodies are preferably used to treat infections
caused by P. aeruginosa serotypes O2, O5, O16, O18 and O20. The substances etc. are also
preferably used to treat infections caused by P. aeruginosa serotypes O3 or O6, which are
predominant clinical isolates. It will be appreciated that the substances may also be useful
to treat infections caused by other members of the family Pseudomonadaceae (e.g. P. cepacia
and P. pseudomallei), and to treat other bacteria which produce O-antigen (e.g. other
gram-negative bacteria such as [...] enterica, Vibrio [...], [...] enterocolitica and
Shigella flexneri).

The substances identified using the methods described herein may be
formulated into pharmaceutical compositions for administration to subjects in a biologically
compatible form suitable for administration in vivo. By "biologically compatible form
suitable for administration in vivo" is meant a form of the substance to be administered in
which any toxic effects are outweighed by the therapeutic effects. The substances may be
administered to living organisms including humans, and animals. Administration of a
therapeutically active amount of the pharmaceutical compositions of the present invention
is defined as an amount effective, at dosages and for periods of time necessary to achieve
the desired result. For example, a therapeutically active amount of a substance may vary
according to factors such as the disease state, age, sex, and weight of the individual, and
the ability of antibody to elicit a desired response in the individual. Dosage regima may
be adjusted to provide the optimum therapeutic response. For example, several divided
doses may be administered daily or the dose may be proportionally reduced as indicated by
the exigencies of the therapeutic situation.

The active substance may be administered in a convenient manner such 
as by injection (subcutaneous, intravenous, etc.), oral administration, inhalation,
transdermal application, or rectal administration. Depending on the route of 
administration, the active substance may be coated in a material to protect the compound 
from the action of enzymes, acids and other natural conditions which may inactivate the 
compound. 

The compositions described herein can be prepared by per se known
methods for the preparation of pharmaceutically acceptable compositions which can be
administered to subjects, such that an effective quantity of the active substance is combined
in a mixture with a pharmaceutically acceptable vehicle. Suitable vehicles are described,
for example, in Remington's Pharmaceutical Sciences (Remington's Pharmaceutical
Sciences, Mack Publishing Company, Easton, Pa., USA 1985). On this basis, the
compositions include, albeit not exclusively, solutions of the substances in association with 
one or more pharmaceutically acceptable vehicles or diluents, and contained in buffered 
solutions with a suitable pH and iso-osmotic with the physiological fluids. 

The reagents suitable for applying the methods of the invention to
identify substances that affect O-antigen synthesis and assembly in P. aeruginosa may be
packaged into convenient kits providing the necessary materials packaged into suitable
containers. The kits may also include suitable supports useful in performing the methods of 
the invention. 

The utility of the substances, antibodies, and compositions of the 
invention may be confirmed in experimental model systems.

The invention will be more fully understood by reference to the 
following examples. However, the examples are merely intended to illustrate embodiments 
of the invention and are not to be construed to limit the scope of the invention. 
EXAMPLES 

Materials and methods used in Examples 1 to 3 described herein
include the following:

Bacterial strains and culture conditions 

The bacterial strains used in this study are listed in Table 6. All
bacterial strains were maintained on Tryptic Soy Agar (Difco Laboratories, Detroit, MI).
Pseudomonas Isolation Agar (PIA; Difco) was used for selection of transconjugants following
mating experiments. Antibiotics used in selection media include: ampicillin at 100 µg/ml for
E. coli and carbenicillin at 450 µg/ml for P. aeruginosa, tetracycline at 15 µg/ml for E. coli
and 90

DNA procedures

Small-scale preparation of plasmid DNA was [...] alkaline lysis method of Birnboim and
[...]. Large-scale preparations of plasmid DNA were obtained [...] following the method of
Goldberg and [...]. Restriction enzymes were purchased from GIBCO/BRL and [...]. [...]
was transformed into E. coli and P. aeruginosa by electroporation using a Bio-Rad
electroporation unit (Bio-Rad Laboratories, Richmond, [...]) [...] methods of [...] and
[...] Kropinski [...], respectively. Recombinant plasmids were [...] triparental matings
[...] according to the manufacturer's recommendations. Hybridized DNA was detected [...]
conjugated to alkaline [...] 3-(2'-spiroadamantane)-4-methoxy-4-(3''-phosphoryloxy)-phenyl-
1,2-dioxetane [...].

Tn1000 mutagenesis of pFV.TK6
DNA sequencing 

DNA sequence analysis of the 1.9 kb insert of pFV.TK8 [...]
by the MOBIX facility (McMaster University, Hamilton, ON). [...]
of pFV.TK8 was cloned into the sequencing vector pBluescript II KS and double-strand
sequenced using a model 373A DNA [...] unit (Applied Biosystems, Foster City, CA).
Oligodeoxynucleotide primers for sequencing were synthesized [...].
[...] Taq DyeDeoxy Terminator Cycle Sequencing Kit (Applied Biosystems) was used for cycle
sequencing reactions which were carried out in an Ericomp (San Diego, CA) model TCX15
thermal cycler.
Sequence Analysis 

The computer software programs Gene Runner for Windows (Hastings 
Software, New York, NY) and PCGENE (IntelliGenetics, Mountain View, CA) were used for
nucleic acid sequence analysis, amino acid sequence analysis, and characterization of the 
predicted protein. DNA and protein database searches were performed using the NCBI 
BLAST network server (Altschul et al, 1990; Gish and States, 1993). 
Mutagenesis of the rfc gene of P. aeruginosa PAO1

In order to construct P. aeruginosa rfc chromosomal mutants, a novel gene
replacement vector, pEX100T (Schweizer and Hoang, 1995), was used. This vector
contains the sacB gene of B. subtilis, which imparts sucrose sensitivity on gram-negative
organisms and allows for positive selection of true mutants from the more
frequently occurring merodiploids. In the first step of this experiment, the 5.6 kb HindIII
fragment of pFV.TK6 was blunt-ended using T4 DNA polymerase and subcloned into the
SmaI site of pEX100T. An 875 bp GmR cassette from pUCGM (Schweizer, 1993) was then
cloned into the single BamHI site of the insert DNA. The resulting plasmid, pFV.TK9, was
transformed into the mobilizer strain E. coli SM10 and then conjugally transferred into
PAO1 (Simon et al., 1983). After mating, cells were plated on PIA containing 300 µg/ml of
Gm. Colonies that grew on the Gm-containing medium were picked and streaked on PIA
containing 300 µg/ml Gm and 5% sucrose to identify isolates that had lost the vector-
associated sacB gene, and thus had become resistant to sucrose. Southern blot analysis was
performed to verify that gene replacement had occurred (Figure 24).
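
The two-step selection described above (gentamicin resistance, then sucrose resistance)
amounts to a small decision table. The Python sketch below is illustrative only; the function
name and the result labels are assumptions introduced here, and a sucrose-resistant isolate is
still only a putative mutant until confirmed by Southern blot analysis, as the text notes.

```python
def classify_isolate(gm_resistant: bool, sucrose_resistant: bool) -> str:
    """Interpret the phenotype of a colony after mating (hypothetical labels).

    - No gentamicin resistance: the GmR cassette is absent, so the colony
      would not survive the first selection step.
    - GmR and sucrose-sensitive: the vector-associated sacB gene is still
      present (merodiploid).
    - GmR and sucrose-resistant: sacB has been lost, consistent with gene
      replacement.
    """
    if not gm_resistant:
        return "not selected"
    if sucrose_resistant:
        return "putative gene replacement"
    return "merodiploid"


print(classify_isolate(gm_resistant=True, sucrose_resistant=True))
```
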
Preparation of LPS 

LPS used in sodium dodecyl sulfate-polyacrylamide gel
electrophoresis (SDS-PAGE) and Western immunoblotting experiments was prepared
according to the proteinase K digest method of Hitchcock and Brown (1983). 
SDS-PAGE 

The discontinuous SDS-PAGE procedure of Hancock and Carey (1979) 
utilizing 15% running gels was used. LPS separated by SDS-PAGE was visualized by silver-
staining according to the method of Dubray and Bezard (1982). 
Immunoblotting 

The Western immunoblotting procedure of Burnette (1981) was used
with the following modifications. Nitrocellulose blots were blocked with 3% (w/v) skim
milk followed by incubation with hybridoma culture supernatant containing either MAb
MF15-4, specific for O5 LPS, or MAb N1F10, specific for A-band LPS. The blots were
developed at room temperature, using goat anti-mouse F(ab')2 fragment conjugated antibody
(Jackson Immunoresearch Laboratories, West Grove, PA) and a substrate consisting of 30 mg
of Nitro Blue Tetrazolium and 15 mg of 5-bromo-4-chloro-3-indolyl phosphate toluidine
(Sigma, St. Louis, MO) in 100 ml of 0.1 M bicarbonate buffer (pH 9.8).

EXAMPLE 1 

Analysis of the LPS from mutants AK1401 and rd7513. Strain AK1401 has been
previously shown to contain A-band LPS; its B-band LPS consists of complete core plus one
O-repeat unit (SR phenotype) (Berry and Kropinski, 1986; Lam et al., 1992). Strain rd7513
is a mutant of AK1401 that has the SR phenotype but is no longer producing A-band LPS
due to a mutation in an A-band biosynthetic gene (Lightfoot and Lam, 1991). rd7513
was used in the study described in the examples, in addition to AK1401, but the majority of
this investigation will focus on AK1401.

Complementation of O-antigen expression in P. aeruginosa AK1401. Mobilization of
pFV100, which contains the O5 rfb gene cluster, into SR mutant AK1401 resulted in
production of O5 B-band LPS. These results suggest that an O-polymerase gene might be
localized on the cloned DNA. Analysis of LPS isolated from PAO1 and AK1401(pFV100)
in both silver-stained SDS-PAGE gels and Western immunoblots, reacted with O5-specific
MAb MF15-4, revealed that the two strains expressed similar high molecular weight LPS
profiles (Figure 22 a, b). In order to localize the putative rfc gene on the 26 kb insert of
pFV100, various subclones were made (Figure 23) and used in complementation studies with
AK1401. Plasmid pFV.TK2, which contains a 16.5 kb XbaI fragment from pFV100, was able
to complement O5 O-antigen production after mobilization into AK1401 (data not shown).
Plasmids pFV.TK3, pFV.TK4, and pFV.TK5 were generated and mobilized into AK1401;
however, none of the three plasmids was able to complement B-band synthesis in this
mutant. Subsequently, pFV.TK6, which contains a 5.6 kb HindIII insert, was made and was
able to complement the SR phenotype of AK1401 (data not shown).

Transposon Tn1000 mutagenesis of pFV.TK6. Transposon mutagenesis using Tn1000 was
performed in order to more precisely define the region of insert DNA in pFV.TK6
responsible for complementation of O-antigen expression in AK1401. pFV.TK6::Tn1000
recombinants were mobilized into AK1401 and then screened for the lack of expression of O-
antigen using O5-specific MAb MF15-4. Plasmid DNA was isolated from colonies that did
not react with MAb MF15-4, and subjected to restriction enzyme analysis to determine the
location of the Tn1000 insertion in pFV.TK6. Three Tn1000 insertions in a 1.5 kb XhoI
fragment were found to interrupt O-antigen expression in AK1401 (Fig. 23). This 1.5 kb
XhoI fragment was cloned into vector pUCP26 (pFV.TK7) and mobilized into AK1401. In
Western immunoblots of LPS from AK1401(pFV.TK7) with MAb MF15-4, no reaction of this
antibody with high molecular weight B-band LPS could be detected (data not shown).
Therefore, the 1.5 kb XhoI insert in pFV.TK7 was unable to restore the O-polymerase
function in AK1401. A 1.9 kb XhoI-HindIII fragment was then subcloned into pUCP26 and
the resulting plasmid was designated pFV.TK8 (Figure 23). Mobilization of this
recombinant plasmid into both SR mutants, AK1401 and rd7513, resulted in restoration of
O-antigen expression. Silver-stained SDS-PAGE gels and Western blots reacted with MAb
MF15-4 showed that the AK1401(pFV.TK8) transconjugants expressed levels of O5 B-band
LPS comparable to that produced by the wild-type PAO1 (Figure 22).

Southern analysis using a 1.5 kb XhoI probe. The 1.5 kb XhoI insert of pFV.TK7, internal to
the rfc coding region, was labelled with dUTP conjugated to digoxigenin and used to probe
XhoI-digested chromosomal DNA from the twenty P. aeruginosa serotypes. The probe
hybridized to a 1.5 kb fragment in serotypes O2, O5, O16, O18 and O20 (data not shown),
suggesting that these serotypes may share a similar O-polymerase gene. These hybridization
results are not surprising in that serotypes O2, O5, O16, and O20 share a similar O-repeat
backbone structure (Knirel, 1990). Although the O-antigen structure of serotype O18 has not
yet been determined, it exhibits cross-reactivity with polyclonal antisera raised against
serotype O5 (data not shown), suggesting that it has an O-repeat unit structure similar to
that of O5. In a recent study, Collins and Hackett (1991) found that a probe generated from
the rfc gene of S. enterica (typhimurium) cross-hybridized to chromosomal DNA of
Salmonella groups A, B, and D1 strains but not with strains of groups D2 or E2, suggesting
that the former may share a common rfc gene. In addition, studies done by Nurminen and
coworkers (1971) have shown that the O-polymerase enzymes of Salmonella groups B and
D1 strains are able to polymerize O-repeat units of either serotype.

Generation of P. aeruginosa chromosomal rfc mutants. In order to confirm that the insert
DNA of pFV.TK8 codes for an O-polymerase gene, insertional mutagenesis was performed
and the resulting plasmid used for homologous recombination with the PAO1 chromosome.
In the first step, the 5.6 kb insert of plasmid pFV.TK6 was cloned into a novel gene
replacement vector, pEX100T (Schweizer and Hoang, 1995). pEX100T is a pUC19-based
plasmid that does not replicate in P. aeruginosa; therefore, maintenance of plasmid DNA
can only occur after homologous recombination into the chromosome. The 5.6 kb insert of
pFV.TK6 was used for gene replacement instead of the 1.9 kb insert of pFV.TK8 to ensure
that there was sufficient DNA for homologous recombination. The next step involved
insertion of an 875 bp GmR cassette into a unique BamHI site in the insert DNA (Figure 24b).
This step generated a mutation in the rfc gene and provided a means of later selecting for
colonies that had undergone homologous recombination. Because the vector, pEX100T,
contains the sacB gene of Bacillus subtilis, it renders Gram-negative organisms sensitive to
sucrose. Streaking GmR recombinants on media containing 5% sucrose allowed separation of
true recombinants from merodiploids, since merodiploids exhibit sucrose-sensitivity because
of the presence of the vector-associated sacB gene. Of the eighty GmR colonies that were
[...], twenty-four were found to be sucrose-resistant. Three of the twenty-four isolates
were randomly chosen for further characterization and were designated OP5.2, OP5.3 and
OP5.5. Southern blot analysis of chromosomal DNA from these three putative mutants was
performed in order to confirm that gene replacement had occurred. The 1.5 kb XhoI fragment
of pFV.TKS was used to probe XhoI-digested chromosomal DNA isolated from the PAO1
wild-type strain as well as OP5.2, OP5.3, and OP5.5. In strains that had undergone gene
replacement, XhoI digestion should yield a probe-hybridizable fragment of 2.4 kb instead
of 1.5 kb because of the insertion of the 875 bp GmR cassette (Figure 24 a, b). Southern blot
analysis of the three GmR sucrose-resistant isolates revealed a probe-reactive fragment of
2.4 kb (Figure 24 c, lanes 2-4), whereas the probe reacted with a 1.5 kb fragment of the
PAO1 control DNA (Figure 24 c, lane 1), demonstrating that gene replacement had occurred
in OP5.2, OP5.3, and OP5.5. Analysis of LPS from these three strains in silver-stained gels
and Western immunoblots with O5-specific MAb MF15-4 demonstrated that they were not
capable of producing long chain B-band O-antigen (Fig. 25a, b). Immunoblots reacted with
A-band specific MAb N1F10 revealed that, like the SR mutant AK1401, these three
mutants were still producing A-band LPS (Figure 25c). Biosynthesis of A-band LPS,
therefore, appears to be unaffected by this chromosomal mutation. The relative mobility of
the core-lipid A bands was also similar to that of the SR mutant AK1401 (Figure 25a);
therefore the LPS phenotype of the three rfc knockout mutants was identical to that of
AK1401. Mobilization of pFV.TK8 into OP5.2, OP5.3 and OP5.5 restored O-antigen
expression in the three mutants (data not shown), indicating that the PAO1 chromosomal
modification was the result of a direct mutation of the rfc gene and not caused by a
secondary mutation.
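
The expected size shift in the Southern blot follows from simple addition of the cassette
length to the wild-type fragment. The Python sketch below restates that arithmetic using the
sizes given in the text; the function name is illustrative, and the calculation assumes the
cassette carries no site for the enzyme used in the digest.

```python
def expected_fragment_bp(wild_type_bp: int, cassette_bp: int) -> int:
    """Size of the probe-reactive fragment after cassette insertion.

    Assumes the cassette has no internal site for the restriction enzyme
    used, so the fragment simply grows by the cassette length.
    """
    return wild_type_bp + cassette_bp


wild_type = 1500  # 1.5 kb XhoI fragment (from the text)
cassette = 875    # GmR cassette (from the text)
mutant = expected_fragment_bp(wild_type, cassette)
print(f"{mutant} bp, i.e. about {mutant / 1000:.1f} kb")
```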

Nucleotide sequence determination and analysis of rfc. The 1.9 kb XhoI-HindIII insert of
pFV.TK8, containing the rfc coding region, was cloned into pBluescript and subjected to
double-strand nucleotide sequence analysis. Examination of the nucleotide sequence (Figure 
9; GenBank accession number U17294) revealed one open reading frame (ORF) that coded for 
a protein of 438 amino acids, with a predicted mass of 48.9 kDal. This ORF was designated 

Analysis of the P. aeruginosa rfc mol. % G + C content (44.8%, Table 6)
revealed that it is significantly lower than that of the rest of the genome (67.2%;
Palleroni, 1984). A low G + C content is a common feature of reported rfc genes (Collins and
Hackett, 1991; Brown et al., 1992; Klena and Schnaitman, 1993; Morona et al., 1994) and has
also been observed in all of the rfb clusters so far analyzed. The finding that the gene
coding for the O-polymerase enzyme and the genes encoding the O-antigen repeat units
have a compatible G + C content is not surprising since the specificity of the enzyme must
relate to the structure of its substrate.
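
The mol. % G + C figure quoted above is the fraction of G and C bases in a sequence. A minimal
Python sketch of the calculation follows; the function name and the short toy sequence are
assumptions for illustration, and the toy sequence is not the rfc sequence.

```python
def gc_percent(sequence: str) -> float:
    """Percentage of G and C among the unambiguous bases (A, C, G, T)."""
    bases = [b for b in sequence.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G or T bases")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


# Toy sequence for illustration only (not taken from the rfc gene).
print(gc_percent("ATGCATTTAA"))
```
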

Homology searches of both the nucleotide and the amino acid
sequences of the P. aeruginosa rfc gene were performed using EMBL/GenBank/PDB and
Swiss-PROT (release 28.0) databases (Altschul et al., 1990; Gish and States, 1993).
Comparison of the P. aeruginosa rfc sequences with sequences reported for other prokaryotic
genes revealed no significant homology, including with those reported for other rfc genes.
Previous studies on the structure of P. aeruginosa O-antigens have revealed that their sugar
compositions differ significantly from most other enterobacterial O-antigens (Knirel et al.,
1988). Neutral sugars, which are commonly found in enteric O-antigens, are only rarely
found in O-antigens of P. aeruginosa. In addition, P. aeruginosa O-antigens are rich in amino
sugars, many of which are substituted with acyl groups, a phenomenon rarely found in
natural carbohydrates. Given the unique sugar composition of P. aeruginosa O-antigens, and
the finding by Morona et al. (1994) that the S. flexneri Rfc protein showed no homology
with other enteric Rfc proteins, it is not surprising that the P. aeruginosa Rfc protein
exhibited no sequence homology with those of other enteric organisms.

The P. aeruginosa rfc gene product does, however, have several
features in common with other reported Rfc proteins, including the fact that it is very
hydrophobic. The mean hydropathic index of the P. aeruginosa Rfc is 0.8 while those of
other enteric organisms have been reported to range from 0.65 - 1.08 (Table 7). Examination
of the hydropathy profile of this protein and analysis of the amino acid sequence, using the
software program PCGENE, revealed that it is an integral membrane protein with 11
putative membrane-spanning domains (Klein et al., 1985). The Rfc proteins of S. enterica
(typhimurium) and S. enterica (muenchen) are reported to have 11 membrane-spanning
domains, while that of S. flexneri is reported to have 13 (Morona et al., 1994); therefore,
structural similarities appear to exist among the Rfc proteins of these four organisms.
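
A mean hydropathic index is the average of per-residue hydropathy values over the protein.
The Python sketch below assumes the Kyte-Doolittle scale, which the text does not name; the
function name is illustrative and no value for the Rfc protein is computed here.

```python
# Kyte-Doolittle hydropathy values (assumed scale; not stated in the text).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(protein: str) -> float:
    """Average Kyte-Doolittle hydropathy over all residues of a protein."""
    seq = protein.upper()
    if not seq:
        raise ValueError("empty sequence")
    # A KeyError here signals a residue outside the 20 standard amino acids.
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


# Toy peptide for illustration only: four leucines average to about 3.8.
print(round(mean_hydropathy("LLLL"), 2))
```

Positive averages indicate an overall hydrophobic protein, which is the property the text
reports for Rfc.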

Codon usage and amino acid composition analysis. When the codon usage and amino acid
composition of the P. aeruginosa Rfc protein was compared with that reported for S.
enterica (typhimurium), S. enterica (muenchen), and Shigella flexneri Rfc proteins (Collins
and Hackett, 1991; Brown et al., 1992; Morona et al., 1994), significant similarities were
found between them (data not shown). Rfc proteins have been reported to contain a high
content of three amino acids, namely, leucine, isoleucine, and phenylalanine (Morona et al.,
1994). These three amino acids account for 27, 30, and 37 % of the total amino acids of the
Rfc proteins of S. enterica (typhimurium), S. enterica (muenchen), and Shigella flexneri,
respectively (Morona et al., 1994). In the Rfc protein of P. aeruginosa, these amino acids
represent 30% of the total amino acid composition.
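
The combined leucine, isoleucine and phenylalanine content described above is a simple
residue count divided by protein length. The Python sketch below shows the calculation; the
function name and the toy peptide are assumptions for illustration and are not the Rfc
sequence.

```python
def lif_percent(protein: str) -> float:
    """Percentage of residues that are Leu (L), Ile (I) or Phe (F)."""
    seq = protein.upper()
    if not seq:
        raise ValueError("empty sequence")
    count = sum(seq.count(aa) for aa in "LIF")
    return 100.0 * count / len(seq)


# Toy 10-residue peptide with one L, one I and one F (illustration only).
print(lif_percent("LIFAAAAAAA"))
```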

In summary, the present inventors have isolated an rfc gene in P.
aeruginosa O5 encoding an O-polymerase enzyme. Using a gene-replacement system, P.
aeruginosa rfc- chromosomal mutants were generated which expressed the typical SR LPS
[...] hydrophobic, containing 11 membrane-spanning domains; the Rfc coding region has a lower
mol. % G + C than the P. aeruginosa chromosomal average; and it has a similar amino acid
composition and codon usage to that reported for other Rfc proteins.

EXAMPLE 2

[...] O5 (PAO1) [...]

[...] P. aeruginosa serotype O5 (PAO1) [...] O-chain
[...] of Rol was [...] by the generation of knockout mutants.

[...] DNA sequence [...] of pFV100, pFV161 (Figure 26), was
found to have homology to the [...] from a number of [...]
pFV161. A cosmid library of P. aeruginosa (PAO1) genomic DNA was screened
[...] labeled probe [...] pFV161 to [...] (pFV400)
containing the complete rol gene. Southern blot analysis of DNA from pFV400, digested
with a number of different restriction enzymes, was performed. The pFV161 probe
hybridized to an approximately 2.3 kb HindIII fragment of pFV400. Assuming the rol gene
[...] was similar in size (approx. 1 kb) to [...]
family Enterobacteriaceae (Morona et al., 1995), [...] would be sufficient to
contain the entire putative [...]. [...] fragment was subcloned into the
[...] II SK (PDI [...], Aurora, [...]).

Nucleotide sequencing [...] was performed using
dye terminator cycle sequencing (GenAlyTiC [...]). [...]
mass of 39.3 kDa was identified (GenBank accession U50397). Homology searches using
[...] the putative P. aeruginosa rol [...] showed approximately [...] amino
[...] between the putative Rol protein and the Rol [...]
typhimurium, [...] and Shigella flexneri (Morona et al., 1995) (Table 8).
. . . , To confirm that the insert DNA of pFV401 codes for a Ro, protein 

3 «- on., mutagenesis was performed and the resu,..„ 8 p,as mid construe, u L Z 
homo ogous recombination „ ilh ft e pA0 , Brie „ y ^ ^ ^ 

was cioned into , nove, g .ne.r. pl a«me„ t vector. pEX.OOT (Schwei.er and Ho^g I995) 
that does not replicate in P. aeruginosa. pEX100T also contains the sacB gene of B. subtilis,
which imparts sucrose sensitivity on Gram-negative organisms and allows for positive
selection of true mutants from the more frequently occurring merodiploids. Next, an 875 bp
gentamicin-resistance (GmR) cassette from pUCGM (Schweizer, 1993) was inserted into a
unique XhoI site in the insert DNA. The resulting plasmid (pFV401TG) was transformed
into the mobilizer strain E. coli SM10 and then conjugally transferred into PAO1 (Simon et
al., 1983). After mating, cells were plated on Pseudomonas isolation agar (PIA; Difco
Laboratories, Detroit, Mich.) containing 300 µg ml⁻¹ gentamicin (Sigma Chemical Co.,
St. Louis, Mo.) and 5% sucrose. This selective medium allows the identification of isolates
that have undergone homologous recombination and lost the vector-associated sacB gene,
thus becoming resistant to sucrose. Southern blot analysis with both wild-type rol gene and
GmR cassette probes was used to confirm the insertional mutation. The wild-type control and
the mutants showed probe-reactive fragments of 2.3 kb and 3.1 kb, respectively (Fig. 27).
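
As an illustrative check on the Southern blot result, the expected size of the probe-reactive fragment in the mutant can be estimated by adding the cassette length to the wild-type fragment length. The following is a minimal sketch only; the function name is hypothetical, and it assumes that the restriction enzyme used does not cut within the cassette, which the text does not state.

```python
def expected_mutant_fragment_kb(wild_type_kb: float, cassette_bp: int) -> float:
    """Estimate the fragment size (kb) after a cassette is inserted into it.

    Assumes (hypothetically) that the digesting enzyme has no site inside
    the cassette, so the whole cassette length is added to the fragment.
    """
    if wild_type_kb <= 0 or cassette_bp < 0:
        raise ValueError("sizes must be positive")
    return wild_type_kb + cassette_bp / 1000.0


# Values taken from the text: a 2.3 kb wild-type fragment and an 875 bp cassette.
estimate = expected_mutant_fragment_kb(2.3, 875)
print(f"expected mutant fragment: about {estimate:.1f} kb")
```

Under these assumptions the estimate is close to the approximately 3.1 kb fragment reported for the mutants; a small difference of this kind may fall within the resolution of gel-based sizing.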

The LPS of the mutants was prepared according to the proteinase K
digest method of Hitchcock and Brown (1983). The LPS was analyzed using sodium dodecyl
sulfate polyacrylamide gel electrophoresis (SDS-PAGE) and Western immunoblots
according to the methods described previously (de Kievit et al., 1995). When compared
with the wild-type strain, the mutant LPS showed a marked alteration in the O-antigen
ladder-like banding pattern, in which there was a decrease in high molecular weight
bands and an increase in visible low molecular weight bands. This change corresponds to a
loss of bimodal distribution in O-antigen length (Fig. 28).

A T7 expression system (Tabor and Richardson, 1985) was used for
expression of the Rol protein. A unique protein band with an apparent molecular mass of 39
kDa was observed. This expressed polypeptide corresponded well to the predicted mass of
39.3 kDa. This band was not observed in the vector-only control (Fig. 29).

In conclusion, a rol gene was isolated in P. aeruginosa O5 (PAO1)
encoding a protein which regulates O-antigen chain length. Using a gene-replacement
system, P. aeruginosa rol::GmR knockout mutants were generated which express LPS with
unregulated O-antigen chain length. Thus, the P. aeruginosa O5 (PAO1) Rol protein has both
sequence and functional homology to other reported Rol proteins. This also confirms that
the pathway for P. aeruginosa B-band LPS biosynthesis is Rfc-dependent. The function of
Rol is often associated with the Rfc protein, an O-polymerase (Whitfield, 1995; de Kievit et
al., 1995).

EXAMPLE 3 

Sequencing of the psb gene cluster.

The isolation of a cosmid clone, pFV100, containing the psb gene cluster
of P. aeruginosa O5 identified in accordance with the present invention, was previously
described (Lightfoot and Lam, 1993). Several subclones of pFV100 containing the psb genes
were constructed. The sequencing and characterization of two of these clones (pFV111 and
pFV110), containing the rfc and psbL (rfbA) genes respectively, have previously been
described (de Kievit et al., 1995; Dasgupta and Lam, 1995). Sequencing of the remainder of
the pFV100 insert was undertaken in order to identify all the genes required for synthesis of
the O5 O-antigen.

Sequencing of the entire insert of pFV100, a total of 24416 bp, revealed
a large number of open reading frames (ORFs) on both strands. ORFs which were reading in
the same direction as rfc and psbL and which had homology either to any previously
identified polysaccharide or antibiotic biosynthetic genes or to highly conserved bacterial
genes were characterized further. A total of 21 ORFs which could be involved in synthesis
of the O5 O-antigen were identified (Table 1). These genes were designated psbA through
psbN in the 5' to 3' direction, with the exceptions of rol and rfc, which were named
according to convention. A further 4 ORFs with high homology to other bacterial genes or
insertion sequences but which are not thought to be involved with LPS synthesis were
identified (hisH, hisF, uvrB, IS407; Table 1).

Distribution of the psb genes among the 20 serotypes of P. aeruginosa and localization of the
O5-specific region.

Southern blot analysis of the 20 serotypes of P. aeruginosa using
various psb genes as probes revealed an interesting dichotomy. All of the probes tested
which were 5' to the IS407 element hybridized only with chromosomal DNA from
serotypes O2, O5, O16, O18 and O20 (Table 1). As stated above, these five serotypes have
biochemically and structurally similar O-antigens (Figure 1). Although the O-antigens of
serotypes O2, O5, O16, O18, and O20 are serologically distinct and have been shown to have
clear biochemical differences, none of the psb genes tested hybridized only to serotype O5
chromosomal DNA at high stringency.

In contrast with these findings, probes for DNA sequences 3' to the
IS407 element, and the IS407 element itself, hybridized with the chromosomal DNA from
all 20 serotypes of P. aeruginosa (Table 1). These results show that the insertion sequence is
the junction between the portion of the psb cluster specific for O5 and related serotypes
(hereinafter referred to as the O5-specific region, or sometimes as the Group I genes) and
the non-specific chromosomal DNA. Therefore, psbL appears to be the last gene of the
O5-specific region. Despite the fact that the DNA 3' of the insertion element is not
O5-specific, this region is thought to contain at least two ORFs (psbM and psbN, sometimes
referred to as the Group II genes) which may be involved in O5 LPS biosynthesis (see
A 1.2 kb probe from the extreme 5' end of the insert of pFV100
hybridized only to the five related serotypes, indicating that the 5' end of the O5-specific
region had not been cloned. This probe was used to isolate an overlapping cosmid, pFV400.
Various subclones of pFV400 were constructed to localize the 5' end of the O5-specific region
to within a 1.3 kb SstI-XhoI fragment located 1.7 kb upstream of the 5' end of pFV100.
Preliminary sequence analysis of this upstream region revealed no additional ORFs
thought to be involved with LPS synthesis. Also, no insertion sequences could be found in
this region of DNA. Localization of the 5' end of the O5-specific region to the 1.3 kb
SstI-XhoI fragment means the total amount of DNA which is specific to O5 and related
serotypes is approximately 20 kb.

The composition and chromosomal milieu of the O5 psb cluster.

The %G+C of the P. aeruginosa chromosome has been determined by
various methods to be approximately 65-67% (Palleroni, 1984; West and Iglewski, 1988).
The %G+C content of the P. aeruginosa O5 psb cluster within the O5-specific region
averages 51.1% overall, with individual genes ranging from a low of 44.5% (psbG) to a
high of 56.8% (psbK) (Table 1). These results are consistent with those seen for other rfb
genes, averaging at least 10% below the chromosomal background, and this is thought to be
reflective either of origin in a low %G+C background (Reeves, 1993) or of possible
regulatory constraints (Collins and Hackett, 1991; Morona et al., 1994a). The %G+C content
of the psbM and psbN genes, which fall outside the O5-specific region, averages 62.6%.
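
By way of illustration, the %G+C comparison described above can be sketched as follows. This is a minimal example under stated assumptions: the function names are hypothetical, the short test sequence is a toy string and not part of the psb cluster, and the source does not specify how its %G+C values were computed.

```python
def gc_percent(seq: str) -> float:
    """Return the %G+C of a DNA sequence, ignoring case and non-ACGT symbols."""
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no A, C, G or T bases")
    gc = sum(1 for base in counted if base in "GC")
    return 100.0 * gc / len(counted)


def gc_deficit(region_gc: float, chromosome_gc: float) -> float:
    """Percentage points by which a region falls below the chromosomal %G+C."""
    return chromosome_gc - region_gc


# Toy sequence (hypothetical): 6 of its 10 bases are G or C.
toy_gc = gc_percent("ATGCGCGCAT")

# Values reported in the text: cluster average 51.1%; chromosome about 65-67%.
deficit_low = gc_deficit(51.1, 65.0)
deficit_high = gc_deficit(51.1, 67.0)
print(f"toy %G+C = {toy_gc:.1f}")
print(f"cluster is {deficit_low:.1f} to {deficit_high:.1f} points below the chromosome")
```

With the reported figures, the deficit exceeds 10 percentage points at either end of the chromosomal range, in keeping with the statement that the cluster averages at least 10% below the chromosomal background.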

Sequence analysis of pFV100/pFV400 revealed no homology to gnd
(encoding 6-phosphogluconate dehydrogenase) in the regions flanking the LPS genes.
However, P. aeruginosa has been shown to convert glucose-6-phosphate to
6-phosphogluconate as part of the Entner-Doudoroff pathway, suggesting a homologue of the
gnd gene is located elsewhere on the chromosome. The location of the P. aeruginosa his
operon is not known, but the few his auxotrophic lesions that have been mapped on the
chromosome of serotype O5 (strain PAO1) are several minutes from the A- and B-band LPS
clusters (Lightfoot and Lam, 1993; Holloway et al., 1994). Interestingly, two his genes
(hisH and hisF) were found in the middle of the psb cluster, within the O5-specific region
(see below). Because these genes fail to hybridize with all twenty serotypes of P.
aeruginosa at high stringency, it is likely they are not native P. aeruginosa his genes, but
were acquired along with the psb genes in a horizontal transfer event.

Homology searches of the GenBank databases with each of the ORFs
in the psb cluster were performed. Assignment of putative function for the products of the
ORFs was made based on homology of the encoded proteins to those previously described.
Because the O-antigen of P. aeruginosa O5 contains two similar 2,3-diacetamido-mannuronic
acid residues, it is anticipated that both residues share a common biosynthetic
pathway.

The 5' end of the pFV100 insert contains a partial rol gene.

The partial open reading frame at the 5' end of the insert of pFV100
was found to have low homology at the amino acid level (34-37%) with the Rol proteins of
Escherichia coli (Batchelor et al., 1992; Bastin et al., 1993), Salmonella enterica sv
Typhimurium (Batchelor et al., 1992; Bastin et al., 1993) and Shigella flexneri (Morona et
al., 1994b). Only 479 bp of rol-homologous DNA (encoding 159 amino acids) were present
from the XhoI cloning site of pFV100. This sequence represented approximately the 3' half
of the putative rol gene, based on the sizes of previously described rol genes. Using the
partial rol sequence as a probe, the complete rol gene was isolated on cosmid
pFV400, and its function confirmed by mutational analysis (Example 2). In other
Rfc-dependent LPS gene clusters, the rol gene is positioned near or at the end of the cluster.
These results, along with the large number of ORFs already identified on pFV100, suggested
that most, if not all, of the genes required for O5 O-antigen biosynthesis are present on this
cosmid.

psbA. 

There is a distance of 807 bp between the rol gene and the first
adjacent gene, psbA. Although P. aeruginosa promoters are not well defined, there are
similarities with E. coli promoters (Harley and Reynolds, 1987; Deretic et al., 1989). There
is a possible σ70-like promoter sequence and a putative ribosomal binding site (RBS)
located 93 bp and 7 bp, respectively, upstream of the start of psbA (Figure 31). PsbA has
homology (summarized in Table 2) to EpsD, thought to be a dehydrogenase required for
synthesis of exopolysaccharide in Burkholderia solanacearum (Huang and Schell, 1995);
to a protein involved in synthesis of the Vi antigen in S. enterica sv Typhi;
and to RffD, a UDP-N-acetyl-D-mannosaminuronic acid dehydrogenase involved in
synthesis of Enterobacterial Common Antigen (ECA) in E. coli (Meier-Dieter et al., 1992).
ECA is an exopolysaccharide common to most enterics that can be linked to lipid A-core in
rough strains. It is composed of N-acetyl-D-glucosamine (GlcNAc),
N-acetyl-D-mannosaminuronic acid (ManNAcA), and 4-acetamido-4,6-dideoxy-D-galactose.
PsbA also has homology with CapL, involved in type 1 capsular
polysaccharide production in Staphylococcus aureus (Lin et al., 1994). The type 1 capsule is
composed of taurine, 2-acetamido-2-deoxy-fucose (Fuc2NAc) and
2-acetamido-2-deoxy-D-galacturonic acid (Gal2NAcA). The sugar compositions of both ECA
and the type 1 capsule are similar to that of the P. aeruginosa O5 O-antigen. PsbA also has
a low level of homology with
ORF7 of the Vi antigen region of E. coli/Citrobacter freundii (accession #221706), and
several GDP-mannose and UDP-glucose dehydrogenases, including AlgD of P. aeruginosa
(Deretic et al., 1987). AlgD is a GDP-mannose dehydrogenase required for alginate
synthesis. These homologies suggest that PsbA functions as a dehydrogenase involved in
the biosynthesis of the mannuronic acid residues, possibly converting
UDP-N-acetyl-D-mannosamine into UDP-N-acetyl-D-mannosaminuronic acid. A large number of
dehydrogenases including PsbA (as well as PsbK and PsbM, below) contain a consensus
nicotinamide adenine dinucleotide (NAD)-binding domain, thought to be important for
activity (Figure 33). An alignment of the amino acid sequences of some PsbA-like proteins is
shown in Figure 34.
psbB.

The psbB gene start is 74 bases from the termination codon of psbA, but
no separate promoter sequence for psbB could be detected. A putative RBS is located 6 bp
from the initiation codon for psbB and the second codon is AAA, the preferred second codon
in E. coli (Gold and Stormo, 1987; Figure 32). The psbB gene product is possibly an
oxidoreductase, dehydratase, or dehydrogenase. It is 28.2% homologous to the LmbZ protein of
Streptomyces lincolnensis required for lincomycin production (Peschke et al., 1995), and also
has homology with the pur10 gene product of Streptomyces alboniger required for puromycin
production (Tercero et al., 1996). PsbB has 17% homology to the BplA protein from B.
pertussis required for LPS production (Allen and Maskell, 1996) and even weaker homology
to ORF334 and MocA from Rhizobium meliloti found in the operon for rhizopine catabolism
(Rossbach et al., 1994). In B. pertussis, the BplA protein is thought to catalyze the final
step in the biosynthesis of UDP-diNAcManA from UDP-diNAcMan (Allen and Maskell,
1996).
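
Percentage figures of the kind quoted above are typically derived from a pairwise alignment of two protein sequences. The following minimal sketch shows one common convention, percent identity over the aligned (ungapped) columns. It is offered for illustration only: the source does not state which program or convention was used to obtain its homology values, the function name is hypothetical, and the two aligned strings are invented toy sequences rather than PsbB or BplA.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Columns in which either sequence has a gap are excluded from the
    comparison; other conventions (e.g. dividing by the full alignment
    length) give different values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy alignment (hypothetical): 7 ungapped columns, 6 of them identical.
identity = percent_identity("MKT-LLVAG", "MRTALL-AG")
print(f"{identity:.1f}% identity")
```

Because the result depends on how gaps and alignment length are treated, values computed in this way are best compared only with others obtained under the same convention.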

Several of the psb genes were found to have high homology with bpl
genes, suggesting a common ancestry. B. pertussis has semi-rough LPS, with only one
O-antigen unit attached to the core oligosaccharide. The composition of the B. pertussis
O-antigen unit is N-acetylglucosamine (GlcNAc), 2,3-dideoxy-2,3-N-acetylmannosaminuronic
acid (2,3-diNAcManA), and N-acetyl-N-methyl fucosamine (FucNAcMe) (Allen and
Maskell, 1996). These sugars are similar to those comprising ECA, S. aureus type 1 capsule,
and the P. aeruginosa O5 O-antigen. The amino acid homology between PsbB and BplA as
well as the similarities in O-antigen unit composition suggest that PsbB could have a
homologous function to that of BplA. Unlike the other putative dehydrogenases encoded in
the psb cluster, PsbB does not contain a consensus NAD-binding domain.
psbC.

The start of psbC overlaps significantly (343 bases) with the stop of
psbB, and psbC could encode a large protein of 85.3 kDa (766 amino acids). Careful scrutiny
of the DNA sequencing results confirmed no sequencing errors were present. Protein
expression will determine whether this entire large ORF is translated. There is a potential
RBS upstream of the AUG codon of psbC (Figure 32).
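
As a rough plausibility check on a predicted protein size, the molecular mass can be approximated from the number of residues. The sketch below uses an average residue mass of 110 Da, a common rule of thumb that is an assumption here and is not taken from the source; the function name is likewise hypothetical. A precise value would require the actual amino acid sequence.

```python
AVERAGE_RESIDUE_MASS_DA = 110.0  # rule-of-thumb average; an assumption, not from the text


def estimate_mass_kda(n_residues: int,
                      avg_residue_mass_da: float = AVERAGE_RESIDUE_MASS_DA) -> float:
    """Approximate protein mass in kDa from its length in amino acids."""
    if n_residues <= 0:
        raise ValueError("number of residues must be positive")
    return n_residues * avg_residue_mass_da / 1000.0


# The text reports a 766 amino acid ORF with a predicted mass of 85.3 kDa.
approx = estimate_mass_kda(766)
print(f"rule-of-thumb estimate for 766 residues: {approx:.1f} kDa")
```

The rule-of-thumb estimate lies within about 1 kDa of the 85.3 kDa reported for the psbC product, which is as close as such an approximation can be expected to come.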

The carboxy-terminal portion of PsbC has homology with a
protein (HI0392) derived from the Haemophilus influenzae genome sequence. This protein has
several hydrophobic domains and is thought to be an integral membrane protein. There is
also homology between PsbC and the macrolide 3-O-acyltransferase gene from the
Streptomyces thermotolerans carbomycin biosynthetic cluster (Arisawa et al., 1995). PsbC
also has weak homology with ExoZ of R. meliloti and with NodX, which also have
hydrophobic domains. A summary of the
homologies between the above proteins is shown in Table 2. The similarities indicate PsbC,
particularly the carboxy-terminal portion, may have 3-O-acyltransferase activity, and
could be involved in acetylation of the mannuronic acid residues in the O5 O-antigen.

psbD.

psbD appears to be translationally coupled with the psbC
gene, since its start codon overlaps the stop codon of psbC. A potential RBS is located 9 bp
upstream of the psbD AUG codon (Figure 32). The product of the psbD gene is homologous to
BplB of B. pertussis (Allen and Maskell, 1996). PsbD and BplB appear to be
O-acetyltransferases, and have homology to a number of other proteins, including a protein
from the plant Arabidopsis thaliana (Bogdanova et al., 1995) (Table 2). As with PsbC, PsbD
may be involved in the acetylation of the mannuronic acid residues comprising two-thirds
of the O5 O-antigen unit. The psb homologues of bplA and bplB, psbB and psbD respectively,
are separated by the large psbC gene.

psbE.

PsbE has high homology with a B. pertussis LPS biosynthetic gene product,
BplC. psbD and psbE are adjacent to one another in the psb cluster, as are bplB and bplC in
the bpl cluster (Allen and Maskell, 1996). However, they do not appear to be
translationally coupled. While there is a potential RBS 9 bp before the psbE start
(Figure 32), it is not known
whether this gene can be transcribed from a promoter internal to the psbD gene. There are
some sequences with weak homology to the E. coli consensus promoter sequence in that area.

Also homologous to PsbE are DegT, from B. subtilis (Takagi et al.,
1990), Saccharopolyspora erythraea ErbS (ERYC1) involved in erythromycin synthesis
(Dhillon et al., 1989), DnrJ from Streptomyces peucetius required for daunorubicin
biosynthesis (Stutzman et al., 1992) and SpsC from B. subtilis involved in spore coat
polysaccharide biosynthesis (Glaser et al., 1993) (summarized in Table 2). There is also
weak homology between PsbE and both MosB for rhizopine synthesis in R. meliloti
(Murphy et al., 1993) and YifI, a hypothetical protein in the rffE/rffT intragenic region of
E. coli (Daniels et al., 1992). The proteins DegT/DnrJ/ERYC1/SpsC form a family of
proteins formerly thought to form the DNA-binding component of sensory-transduction
two-component regulatory systems. More recently, however, their function is suggested to be in
the biosynthesis of 2,3-, 2,4-, and 2,6-dideoxy sugars such as the 2,3-dideoxy mannuronic
acid produced by P. aeruginosa O5 (Thorsen et al., 1993). An alignment of the amino acid
sequences of the PsbE-like proteins is shown in Figure 36.
The O-antigen polymerase, rfc.

The rfc gene starts 254 bases downstream of the end of the psbE gene.
This gene was cloned, sequenced and characterized as described in Example 1. Knockout
mutations generated by insertion of a gentamicin cassette into rfc were used to confirm this
gene encoded the O-antigen polymerase. Gentamicin-resistant mutants were shown to have
the semi-rough phenotype (see Example 1) characteristic of an rfc mutant (Makela and
Stocker, 1984).
psbF. 

The psbF gene appears to be translationally coupled with the rfc gene
since they have an overlapping stop and start. There is an RBS sequence 8 bp upstream of the
initiation codon of psbF. PsbF is most homologous to the ExoT protein of R. meliloti
(Glucksmann et al., 1993), which is thought to be involved in succinoglycan transport. There
is also a small amount of homology to FeuC of B. subtilis, part of its iron uptake system
(Quirk et al., 1994). PsbF is the most hydrophobic protein encoded by the psb cluster (Table
1) and has 9-10 membrane-spanning domains. This secondary structure is reminiscent of that
of RfbX, the putative flippase found in Rfc-dependent O-antigen clusters (Figure 37)
(Schnaitman and Klena, 1993). Mutations in RfbX have been found to be unstable and
deleterious to the host strain (Schnaitman and Klena, 1993). Recently Liu et al. (1996)
confirmed that RfbX (Wzx) mutants accumulate one O-antigen unit on undecaprenol on the
inside of the cytoplasmic membrane. psbF knockout mutants generated by insertion of a
gentamicin resistance cassette into psbF are both A-band and B-band minus (Figure 48). PsbF
may be the P. aeruginosa O5 equivalent of RfbX.
The hisH and hisF genes.

The histidine operon, containing genes required for the biosynthesis of
the amino acid histidine, has previously been shown to lie adjacent to the rfb clusters of
several enteric species (reviewed in Schnaitman and Klena, 1993). Comparison of the
chromosomal map locations of the P. aeruginosa O5 A- and B-band LPS clusters with those
of known PAO1 his mutations showed there were no his genes located adjacent to either the
psa (11-13 min) or psb (37 min) clusters (Lightfoot and Lam, 1993; Holloway et al., 1994).
Therefore, the identification of two genes with high homology to the genes hisF and hisH
of various bacterial species in the middle of the psb cluster was unexpected. The hisH and
hisF genes are located between the psbF and psbG genes (Figure 1), and transcribed in the
same direction. The direction of transcription of the his genes in previously characterized
rfb clusters is opposite to that of the rfb genes (Ames and Hartman, 1974; Macpherson et al.,
1994).

While the deduced amino acid sequence of hisF appears to give a
complete open reading frame (from bases 10387 to 11142), the sequence of hisH appears to be
lacking an AUG initiation codon at the location predicted for the start of the protein based
on amino acid homology. However, there are potential starts at three GUG codons located
51, 72, and 132 bp upstream of the first AUG, located at base 9830. The size of the protein
corresponding to the product of hisH is approximately 21 kDa, indicating it is probably
translated from one of these putative starts. Only the GUG codon at 9777 is preceded by a
good RBS (Figure 32); none of the other potential start codons have consensus RBS sites.
N-terminal analysis of the HisH product will confirm the translational start.

Protein expression analysis of this region shows the products of these
genes are expressed in vitro in both orientations, indicating there is a promoter region
preceding the his genes that can be recognized by E. coli. Analysis of the sequence upstream
of the putative start sites of hisH shows there is a potential promoter sequence with
partial homology to the E. coli consensus -35 and -10 regions (Figure 31). This homology is
comparable to that of previously reported P. aeruginosa promoter sequences that can
function in E. coli (Deretic et al., 1989; Ronald et al., 1992).

In K. pneumoniae, the products of the hisH and hisF genes have been
shown to form a heterodimeric enzyme complex required for the conversion of
N'-[(5'-phosphoribulosyl)-formimino]-5-aminoimidazole-4-carboxamide-ribonucleotide
(5'-PRFAR) to imidazole glycerophosphate (IGP) and
5'-phosphoribosyl-4-carboxamide-5-aminoimidazole (ZMP) (Rieder et al., 1994). Although the
products of the hisH and hisF genes have been shown to function together, the hisH and hisF
genes themselves are separated by a third gene, hisA (Alifano et al., 1996). The hisA and
hisH genes are highly related and are thought to have arisen through gene duplication. The
gene order of hisHAF has been found in all bacterial species characterized to date (Alifano
et al., 1996).

Comparison of the amino acid sequence homologies of various HisF and
HisH proteins (Tables 3 and 4) showed that the P. aeruginosa psb HisF and HisH proteins
are not closely related to any of the HisF/HisH proteins characterized thus far.
Comparison of P. aeruginosa psb HisF with the other HisF proteins shown in Table 6 shows
that it is the most distantly related protein of the group analyzed, at approximately 50%
homology.
psbG. 

There is a distance of 138 bp between hisF and psbG, and a putative
promoter is identified in this region (Figure 31). An RBS is identified 4 bp from a putative
GUG start and 7 bp from the adjacent AUG start codon (Figure 32). The optimum spacing of an
RBS from the initiation site is 8 ±2 bp, suggesting the AUG codon is likely to be the start.
PsbG has limited homology to ORF2 (11.2%) of Vibrio cholerae O-antigen (Comstock et al.,
1996), and less homology with NfrB of H. influenzae, a formate-dependent nitrate
reductase (Fleischmann et al., 1993), and Pfk, a phosphofructokinase of the Gram-positive
bacterium Lactococcus lactis (Xiao and Moore, 1993). Interestingly, the homology
associated with NfrB centres around the metal binding recognition site CXXCH, of which
there are five in NfrB and one in PsbG (amino acids 24-28).
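
The reasoning used above to choose between the two candidate start codons can be expressed as a simple spacing test. The sketch below is illustrative only; the function and variable names are hypothetical, and the optimum of 8 bp with a tolerance of 2 bp is taken directly from the text.

```python
OPTIMAL_SPACING_BP = 8   # optimum RBS-to-start spacing stated in the text
TOLERANCE_BP = 2         # allowed deviation stated in the text


def within_optimal_spacing(spacing_bp: int,
                           optimum: int = OPTIMAL_SPACING_BP,
                           tolerance: int = TOLERANCE_BP) -> bool:
    """True if the RBS-to-start-codon spacing lies within optimum +/- tolerance."""
    return abs(spacing_bp - optimum) <= tolerance


# Candidate psbG starts and their distances from the RBS, as given in the text.
candidates = {"GUG": 4, "AUG": 7}
favoured = [codon for codon, spacing in candidates.items()
            if within_optimal_spacing(spacing)]
print("start codon(s) within the optimal spacing window:", favoured)
```

Only the AUG codon, at 7 bp, falls within the 6 to 10 bp window, whereas the GUG codon at 4 bp does not, which is the basis for regarding the AUG as the more likely start.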

Insertion of a gentamicin cassette into psbG results in B-band deficient
mutants of PAO1, suggesting a role for it in O-antigen biosynthesis.
psbH. 

There are 15 bp between psbG and psbH; however, no RBS can be
detected upstream of the psbH start codon. The third codon is AAA (Figure 32). PsbH
demonstrates low homology with CapM (14.2%) of S. aureus (Lin et al., 1994), involved in
the synthesis of N-acetylgalactosaminuronic acid. PsbH also has homology with a number
of glycosyl transferases, including IcsA (17.1%) (accession #U39810) and RfaK (13%)
(accession #U35713) of Neisseria meningitidis, and RfbF (11.3%) of Klebsiella pneumoniae
(Keenleyside and Whitfield, 1994). There is also a low level of homology with RfpB of
Shigella dysenteriae (Gohmann et al., 1994), and BplH and BplE of B. pertussis (Allen and
Maskell, 1996). These enzymes are likely to belong to a family of transferases involved in
the addition of a similar sugar to the growing O-antigen unit.

RfpB, RfaK, and RfbF are glucosyl- or galactosyl transferases and it is
likely that CapM is the transferase involved in the addition of
N-acetylgalactosaminuronic acid. This suggests that PsbH is one of the two ManA
transferases.
PsbH also has very limited homology to the DnaK proteins of R.
meliloti (Falah and Gupta, 1994) and Agrobacterium tumefaciens (Segal and Ron, 1995).
However, the homology is concentrated around the central region of PsbH. DnaK is a
chaperonin, and is thought to have a role in gene regulation. Homology around the
functional domain of DnaK may suggest a role for psbH/PsbH in regulation of the psb
cluster.
psbI.

The start codon of psbI overlaps the stop codon of psbH. A putative
RBS is situated 6 bp upstream of the AUG start and the second codon is AAA (Figure 32).
PsbI demonstrates strong homology with BplD of B. pertussis (Allen and Maskell, 1996)
(Table 2). BplD is purported to initiate the first step in the biosynthesis of
2,3-diNAcManA. PsbI also demonstrates moderate homology to NfrC and ORF o389 (RffD) of
E. coli (Daniels et al., 1992), EpsC of Burkholderia solanacearum (Huang and Schell, 1995),
YvyH of B. subtilis (Soldo et al., 1993) and RfbC of S. enterica sv Borreze (Keenleyside and
Whitfield, 1995). EpsC is thought to be involved in the biosynthesis of
N-acetylgalactosaminuronic acid, and RfbC is thought to be UDP-N-acetylglucosamine
2-epimerase. Alignment of PsbI and related proteins is shown in Figure 10. Based on these
homologies, it is likely that PsbI converts UDP-N-acetylglucosamine to
UDP-N-acetylmannosamine as the first step in the biosynthesis of mannuronic acid. Interestingly,
the genes encoding the remaining enzymes in this pathway are located upstream and
somewhat removed from the psbI gene (psbABDE).
psbJ.

The distance between psbI and psbJ is 17 bp. A putative RBS is present
immediately following the stop codon of psbI, 13 bp from the AUG start codon of psbJ (Figure
4). PsbJ demonstrates reasonable homology to BplE (52.6%) of B. pertussis, a glycosyl
transferase thought to attach either 2,3-diNAcManA or FucNAcMe to the O-unit (Allen
and Maskell, 1996) (Table 2). TrsE of Yersinia enterocolitica also has homology to PsbJ
(Skurnik et al., 1995), and is thought to be one of the galactosyl- or mannosyl transferases.
An alignment of PsbJ and PsbJ-like proteins is shown in Figure 39. As BplE also has limited
homology with PsbH, it is likely that both PsbH and PsbJ are the transferases involved in
the addition of the two mannuronic acid residues to the B-band O-antigen unit. PsbJ has two
putative membrane-spanning domains at the N-terminus, and may be anchored in the
cytoplasmic membrane.
psbK. 

The start codon of psbK overlaps the stop codon of psbJ, and the second
codon is AAA (Figure 32). PsbK demonstrates homology to a series of glucose dehydratases
including StrP of Streptomyces glaucescens involved in streptomycin biosynthesis (accession
number 629223), ExoB of R. meliloti (Buendia et al., 1991), ORF o355 (incorrectly assigned
RffE) of E. coli (Daniels et al., 1992; Macpherson et al., 1994), GraE of Streptomyces
violaceoruber (Bechtold et al., 1995) and RfbB of a number of organisms including N.
meningitidis (Hammerschmidt et al., 1994) and E. coli (Marolda and Valvano, 1995).
Alignment of these proteins shows the presence of an NAD-binding domain (GXXGXXG) near
the N-terminal end (Figure 5; Macpherson et al., 1994). RfbB and o355 are known to be
involved in the biosynthesis of FucNAc (Meier-Dieter et al., 1992). Based on these
homologies, PsbK is thought to be dTDP-D-glucose 4,6-dehydratase, required as the second
step in the biosynthesis of FucNAc.
psbL.

There are 59 bp between the end of psbK and the start of psbL but no
RBS could be detected in the region preceding the double start codons (Figure 32).
Identification of the psbL (rfbA) gene has previously been reported (Dasgupta and Lam,
1995). Further characterization of PsbL suggests it functions as a transferase, and is thought
to initiate O-antigen unit biosynthesis with the addition of FucNAc to undecaprenol, based
on its homology to Rfe. The alignment of PsbL with TrsF from Y. enterocolitica (Skurnik et
al., 1995) and Rfe from E. coli (Daniels et al., 1992) is shown in Figure 40. Rfe is the initial
transferase involved in the biosynthesis of ECA and some O-antigens (Schnaitman and
Klena, 1993; Macpherson et al., 1994), transferring GlcNAc to undecaprenol (Meier-Dieter
et al., 1992). Because the first transferase in the biosynthesis of O-antigen interacts with
undecaprenol, it would be expected to be a hydrophobic protein. PsbL is the most
hydrophobic (hydropathy index of 0.84, Table 1) of the three putative transferases encoded
in the psb cluster (PsbH, PsbJ, PsbL).
IS407Pa.

Following the psbL gene is an insertion sequence with 61.5% nucleotide
identity with the previously characterized IS407 element of B. cepacia (Wood et al., 1991).
This homology prompted the designation IS407Pa, with the subscript Pa to indicate it is the
P. aeruginosa version. Both elements are similar in size (1243 bp for IS407Bc and 1211 bp for
IS407Pa) and have very similar imperfect inverted repeats (IR) of 12 and 11 bp respectively.
The IS407 elements are similar to IS sequences from other soil-, water- and plant-associated
bacteria, including ISR1 from R. meliloti (Priefer et al., 1989), IS511 from Caulobacter
crescentus, IS1222 from Enterobacter agglomerans, IS476 from Xanthomonas campestris
(Kearney and Staskawicz, 1990), and IS911 from S. dysenteriae (Prere et al., 1990). There
have been previous reports of IS elements in P. aeruginosa (Pritchard and Vasil, 1990; Sokol
et al., 1994) but none of these have homology to the above group; therefore this is the first
report of IS407 in P. aeruginosa. Southern blot analysis using IS407Pa as a probe showed
it is present in all 20 serotypes of P. aeruginosa (Table 2), and most serotypes appear to have
only a single copy of the element.

psbM. 

The psbM gene follows the IS407Pa element and may be transcribed
from one of three potential promoters present in the right IR (Figure 31). A gene-activating
promoter was previously shown to be present in the right IR of IS407Bc (Wood et al., 1991).
psbM is unusual because, in contrast to the other psb genes described above, it hybridizes to
chromosomal DNA from all 20 serotypes (Table 1). psbM mutants, generated by insertion of
a gentamicin cassette into a unique NruI site within psbM, exhibit a B-band LPS-minus
phenotype. This confirms the involvement of the psbM product in LPS biosynthesis, despite
the fact it lies outside of the O5-specific region (Figure 41). PsbM has homology to a range
of proteins involved in exopolysaccharide synthesis, including BplL from the B. pertussis
LPS cluster (Allen and Maskell, 1996), TrsG from the core biosynthetic cluster of Y.
enterocolitica O3 (Skurnik et al., 1995), and CapD from the S. aureus capsular gene cluster
(Lin et al., 1994). These homologies are summarized in Table 2.

As shown previously for BplL, only the carboxy half of the PsbM
protein has homology to GalE from several bacterial species, suggesting it may have
originated as a fusion protein. In support of this hypothesis, PsbM also has homology to
two adjacent ORFs (ORF10 and ORF11) in the LPS cluster of V. cholerae O139 (Comstock et
al., 1996). The homology to ORF10 and ORF11 lies in the amino-terminal and
carboxy-terminal halves of PsbM, respectively (Table 2), suggesting that two similar ORFs were fused
during the evolution of PsbM and the BplL/TrsG/CapD group.

Based on these homologies, PsbM is thought to be involved in the
biosynthesis of the N-acetylfucosamine residue of the O5 O-antigen. As mentioned above,
the O-antigen of B. pertussis, the type 1 capsule of S. aureus and the outer core of Y.
enterocolitica O3 all contain N-acetylfucosamine. PsbM could function as a dehydrogenase,
and it contains two putative NAD-binding domains (Figure 33), as do BplL and TrsG.
Again, these duplications may have arisen from an ancestral fusion of two NAD-binding
domain-containing proteins, and the resulting proteins may be bifunctional.
psbN.

The psbN gene has some homology to eryA, a gene involved in
erythromycin biosynthesis in Saccharopolyspora erythraea. Generation of knockout
mutations in psbN will demonstrate its function in biosynthesis of the O5 O-antigen.
uvrB. 



The last partial open reading frame present on pFV100 has high
homology to the highly conserved uvrB gene from several bacterial species, including E.
coli, S. enterica sv Typhimurium, and Micrococcus luteus. UvrB is a subunit of the UvrABC
DNA excision repair complex involved in removal of thymidine dimers induced by
irradiation with ultraviolet light. The presence of uvrB adjacent to psbN confirms that
psbN is the last gene in the psb cluster that could be involved in O-antigen biosynthesis.
Organization of the psb gene cluster in P. aeruginosa O5.
Several entire rfb clusters, particularly from enteric bacteria, have
been characterized to date (reviewed in Whitfield and Valvano, 1993; and Schnaitman and
Klena, 1993). In general, rfb clusters are located on the chromosome adjacent to the his
operon and the gnd gene. Amongst the enterics, it has previously been shown that the rfb
clusters are organized in a specific fashion (Reeves, 1993; Schnaitman and Klena, 1993).
Genes necessary for sugar biosynthesis are arranged in discrete blocks located 5' to the
transferases and other assembly genes (rfbX, rfc and rol). The psb cluster, however, appears
to be almost randomly organised, with genes thought to be involved in the biosynthesis of
Man(2NAc3N)A and Man(2NAc3NAc)A scattered throughout the gene cluster (psbI, psbE,
psbD, psbB and psbC). The genes thought to encode the biosynthesis of FucNAc are also
scattered throughout the cluster (psbK, psbM, psbG, psbN). Further, the genes encoding
transferases are interspersed throughout the psb cluster (psbH, psbJ, psbL), and are
separated from one another by one gene each. However, the transferase genes do appear to
be organized such that the gene encoding the putative first transferase (PsbL), thought to
initiate O-antigen assembly on undecaprenol, is the most distal. Recent results from
detailed spectroscopic analysis, using high-resolution NMR and mass spectroscopy, of an rfc
mutant of PAO1, strain AK1401, show that FucNAc is the first sugar of the O-antigen unit,
attached to the core oligosaccharide. PsbL's homology to Rfe and its hydropathicity
support the interpretation that it is the first transferase and is responsible for attachment
of the FucNAc residue to undecaprenol. Therefore, based on their gene order and their
relative hydropathic indices (-0.21 and 0.10), the psbJ and psbH gene products are thought
to transfer Man(NAc)2A and Man(2NAc3N)A, respectively.
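
The hydropathic indices used in this argument can be illustrated with a short sketch. It assumes the index is a mean Kyte and Doolittle (1982) hydropathy value over the whole protein; the specification states only that the values were calculated with DNAsis, so the exact procedure may differ. The function names are hypothetical, and the three indices are the values listed in Table 1 for the putative transferases.

```python
# Kyte-Doolittle hydropathy scale (Kyte and Doolittle, 1982).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_index(protein_seq):
    """Mean hydropathy over all residues (assumed definition of the index)."""
    seq = protein_seq.upper()
    return sum(KD[aa] for aa in seq) / len(seq)


def rank_by_hydropathy(indices):
    """Return protein names ordered from most to least hydrophobic."""
    return sorted(indices, key=indices.get, reverse=True)


# Indices reported in Table 1 for the three putative transferases.
reported = {"PsbH": -0.21, "PsbJ": 0.10, "PsbL": 0.84}
ranking = rank_by_hydropathy(reported)
```

Under this reading, PsbL ranks as the most hydrophobic of the three, which is the property the text uses to support its assignment as the first, membrane-associated transferase.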

The O-antigen of P. aeruginosa O5 is an Rfc-dependent heteropolymer.

The psb cluster was shown to contain an rfc gene (see Example 1), the
interruption of which (by knockout mutation and gene replacement) resulted in an SR
phenotype (de Kievit et al., 1995). At least two other gene products, Rol and RfbX, are
thought to be involved in Rfc-dependent synthesis of heteropolymeric O-antigens
(Whitfield, 1994). Here a rol gene has been identified in the psb cluster. However, in the
analysis of the psb genes, no rfbX-like gene was identified. The psbF gene product appeared
to be the most likely candidate, based on its hydropathy profile (Figure 9), but insertional
mutants of psbF do not have the phenotype expected of rfbX mutants.
Identification of his genes within the psb gene cluster. 
mutant, strain AK1380, was isolated which was identified as serotype O16 (see Lam et al.,
1992, Fig. 30; and Kuzio and Kropinski, 1993).

The genetic differences among the five serotypes with related O
antigens are obviously quite minor. Comparison of the DNA sequences of the O2 rfc and the
O5 rfc genes revealed they are very homologous at the nucleotide level.

EXAMPLE 4 

Further Characterization of Rol (Wzz) Gene and Region Upstream 

In this example the rol gene is generally referred to as the wzz gene. 
The materials and methods used in Example 4 are as follows: 

Bacterial strains and plasmids.

The bacterial strains and plasmids used in this study are listed in
Table 8. P. aeruginosa strains were cultured either on Luria broth or plates or on
Pseudomonas Isolation Agar (PIA; Difco, Detroit, MI). E. coli strains were cultured on Luria
broth or plates. Media were supplemented with the antibiotics ampicillin, carbenicillin,
tetracycline, or gentamicin (all from Sigma, St. Louis, MO) as required, using the
concentrations outlined in de Kievit et al., 1995.
DNA methods. 

Chromosomal DNA was isolated from P. aeruginosa using the method
of Goldberg and Ohman, 1984. Plasmid and cosmid DNA was isolated using the Qiagen
midi-prep kit (Qiagen Inc., Chatsworth, CA) as directed by the manufacturer. Restriction
and modification enzymes were supplied by Gibco/BRL (Gaithersburg, MD), Boehringer
Mannheim (Laval, PQ), and/or New England Biolabs (Beverly, MA) and were used as
directed by the manufacturers.

Plasmids were introduced into E. coli by CaCl2 transformation (Huff et
al., 1990) and into P. aeruginosa by electroporation using a BioRad (Richmond, CA) Gene
Pulser apparatus following the manufacturer's protocols. P. aeruginosa electrocompetent cells
were prepared by washing early log phase cells twice for 5 min each in sterile 15%
room-temperature glycerol followed by immediate resuspension in the same solution. Cells
were either used immediately or frozen at -80°C for future use. Alternatively, plasmids
were mobilized into P. aeruginosa through biparental mating with E. coli SM10 carrying
plasmids of interest (Simon et al., 1983).
Construction of plasmids. 

The cosmid pFV100, containing the P. aeruginosa wbp cluster, was used
as a source of DNA for the construction of pFV161 (Fig. 43). An overlapping cosmid,
pFV400, was the source of a 2.3 kb HindIII fragment cloned into pBluescript II SK (pFV401).
For DNA sequencing, a 0.8 kb HindIII-XhoI fragment from pFV401 was subcloned into
pBluescript II SK (pFV402). A 3.0 kb SstI fragment containing the 5' portion of wzz and
upstream sequences was cloned from pFV400 into pBluescript II SK (pFV403). For
complementation experiments, the 2.3 kb insert of pFV401 was cloned into the
Pseudomonas-E. coli shuttle vector pUCP26 (Table 14), downstream of the vector's lacZ
promoter (pFV401-26).

DNA sequencing and analysis.
conjugate (Jackson Laboratories, Bio/Can Scientific, Mississauga, ON). The blots were
developed using a substrate containing 0.3 mg/ml NBT (Nitro Blue Tetrazolium) and 0.15
mg/ml BCIP (5-bromo-4-chloro-3-indolyl phosphate toluidine) (Sigma) in 0.1 M
bicarbonate buffer (pH 9.8).

Creation of wzz knockout mutants through gene replacement.

The gene replacement strategy of Schweizer and Hoang, 1995 was used
for generation of knockout mutations in wzz. The 2.3 kb HindIII insert of pFV401 was cloned
into pEX100T, a pUC19-based vector containing the sacB gene as a selectable marker
(pFV401T). An 875 bp gentamicin resistance cassette from the plasmid pUCGM was then
cloned into the unique XhoI site within the insert (pFV401TGm). Constructs containing the
interrupted wzz gene were mobilized into P. aeruginosa O5 by biparental mating with E.
coli SM10. Since pEX100T does not replicate in P. aeruginosa, selection for gentamicin
resistance allows detection of chromosomally-integrated copies of the mutated gene.
Determination of sucrose and carbenicillin (Cb) sensitivities distinguishes between
merodiploids (sucrose-sensitive, Cb-resistant) and true recombinants (sucrose-resistant,
Cb-sensitive). The presence of the gentamicin cassette in the chromosomal DNA of
P. aeruginosa O5 and O16 wzz mutants was confirmed by Southern blot analysis (not shown).
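
The screening logic described above can be written out as a small decision function. This is only an illustrative sketch of the phenotype rules stated in the text (gentamicin selection, then sucrose and carbenicillin testing); the function name and return labels are hypothetical and are not part of the original protocol.

```python
def classify_exconjugant(gm_resistant, sucrose_resistant, cb_resistant):
    """Classify a P. aeruginosa exconjugant from its three phenotypes.

    gm_resistant: grows on gentamicin (carries the interrupted wzz gene).
    sucrose_resistant: grows on sucrose (sacB-containing vector lost).
    cb_resistant: grows on carbenicillin (vector backbone retained).
    """
    if not gm_resistant:
        # Without the Gm cassette no mutated copy has integrated.
        return "no integration"
    if sucrose_resistant and not cb_resistant:
        # Double crossover: cassette retained, vector sequences lost.
        return "true recombinant"
    if cb_resistant and not sucrose_resistant:
        # Single crossover: whole plasmid integrated next to the wild-type gene.
        return "merodiploid"
    # Any other combination does not fit either expected class.
    return "ambiguous"


result_recombinant = classify_exconjugant(True, True, False)
result_merodiploid = classify_exconjugant(True, False, True)
```

Colonies falling into the "ambiguous" class would need to be checked directly, for example by the Southern blot analysis mentioned in the text.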
RESULTS 

Cloning and sequencing of the P. aeruginosa O5 wzz gene.

Nucleotide sequences with homology to wzz from E. coli, Salmonella
enterica sv Typhimurium and Shigella flexneri (Bastin et al., 1993; Batchelor et al., 1992;
Morona et al., 1995) were identified ending approximately 800 bp upstream of the first gene
of the P. aeruginosa O5 wbp gene cluster, wbpA (Fig. 43). The amount of DNA with
homology to wzz was 479 bp, starting at the XhoI cloning site of the insert of pFV100 and
ending with a stop codon. Based on the average size (1 kb) of previously characterized wzz
genes (Bastin et al., 1993; Batchelor et al., 1992; Morona et al., 1995), this sequence
represented approximately half of the putative P. aeruginosa wzz gene.

A 1.5 kb XhoI-HindIII fragment from pFV161 containing the 3' end of
the putative wzz gene (Fig. 43) was used as a probe to screen a P. aeruginosa O5 cosmid
library. One cosmid (pFV400) which hybridized with the probe was isolated. A
probe-reactive 2.3 kb HindIII fragment from pFV400 was subcloned into pBluescript II SK to
form pFV401 (Fig. 43).

DNA sequence analysis revealed an open reading frame (ORF) of 1046
base pairs (bp), sufficient to encode a protein of 348 amino acids with a molecular mass of
39.3 kilodaltons (kDa), and an isoelectric point of 6.26. Comparison of the deduced amino
acid sequence of the P. aeruginosa O5 protein with those in GenBank revealed from 11.5 to
20.0% amino acid identity with Wzz-like proteins of other species (Table 15). P.
with partial homology to the E. coli σ70 consensus. The -10 regions of these putative
promoters are located approximately 60, 140, or 155 bp upstream of the wzz initiation codon.
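
The sizes reported for the wzz open reading frame in this section can be cross-checked with simple arithmetic. The sketch below assumes an average residue mass of about 110 Da, a common rule of thumb that is not taken from the specification, so the estimated mass is only expected to land near the reported 39.3 kDa. The function name is hypothetical.

```python
AVG_RESIDUE_MASS_DA = 110.0  # assumption: rule-of-thumb average residue mass


def orf_summary(orf_bp):
    """Return (complete codons, leftover bases, estimated mass in kDa)."""
    codons = orf_bp // 3
    leftover = orf_bp % 3
    est_kda = codons * AVG_RESIDUE_MASS_DA / 1000.0
    return codons, leftover, est_kda


# Values quoted in the text: a 1046 bp ORF, 479 bp of which lie on pFV100.
codons, leftover, est_kda = orf_summary(1046)
cloned_fraction = 479 / 1046
```

The 348 complete codons agree with the 348 amino acids reported, and the 479 bp carried on pFV100 correspond to a little under half of the ORF, consistent with the description of that fragment as approximately half of the gene.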
Analysis of the putative Wzz protein function using chromosomal knockout mutants.

A gentamicin-resistance (GmR) cassette was inserted into the putative
wzz gene of P. aeruginosa O5, and the interrupted gene was reintroduced into the O5
chromosome by homologous recombination. Comparison of LPS from the wild-type strain
and the GmR mutant on silver-stained SDS-PAGE gels and Western immunoblots using
B-band-specific MAbs MF15-4 and 18-19 showed that the mutant had an altered LPS
banding pattern. When MAb 18-19 was used, the LPS from the wzz mutant showed an
increase in both shorter and longer B-band LPS O chains and a decrease in B-band O chains
whose length corresponded to that preferred in the O5 parent strain (Fig. 46). On the
immunoblot using MAb MF15-4, which is specific for high-molecular-weight LPS (Lam et
al., 1992), there is also an increase in both shorter and longer B-band O chains. Similar
Western immunoblots using the A-band LPS-specific MAb N1F10 showed the modality of
A-band was unaffected by the wzz mutation (not shown). Although the B-band LPS pattern
of the wzz mutant is significantly different from that of the parent strain, it does not show the
linear distribution of O-antigen chain lengths seen in enteric wzz mutants (Fig. 47A).
Reintroduction of the O5 wzz gene on pFV401-26 restored the mutant to a phenotype similar
to that of the parent but missing both the shortest and longest groups of chain lengths (Fig.
46).

Comparison of the function of wzz in two related serotypes of P. aeruginosa. 

A DNA probe containing the O5 wzz gene hybridized with
chromosomal DNA only from serotypes O2, O5, O16, O18, and O20 of P. aeruginosa, all of
which have chemically and structurally related O antigens (Example 3). The O antigens
of both O5 and O16 are composed of two mannuronic acid residues and one N-acetylfucosamine
residue, but differ in one glycosidic linkage. In O5, the linkage is (1→3)-α-D-Fuc2NAc,
while in O16, the linkage is (1→3)-β-D-Fuc2NAc. This change results in a discernible
difference in the LPS patterns of O5 and O16 (Fig. 46).

Taking advantage of the similarity between the O-antigen gene
clusters of O5 and O16, a wzz knockout mutation was introduced into O16, using the O5 wzz
knockout construct. As an additional benefit, O16 does not express A-band LPS (Lam et al.,
1989), thus any changes in B-band LPS patterns on silver-stained gels were more easily
visualized. The structural difference between O5 and O16 LPS is detected by MAb MF15-4,
which recognizes only O5 and not O16 LPS. To examine LPS from both O5 and O16
simultaneously on Western immunoblots, MAb 18-19, which cross-reacts with all five
serotypes in the O5 serogroup (Lam et al., 1992), was used. Comparison of LPS from the
wild-type O16 parent and the O16 wzz knockout mutant showed the mutant displayed a
loss of modality corresponding to the preferred chain lengths of the parent, and an increase
in higher-molecular-weight LPS (Fig. 46). Interestingly, there still appeared to be chain
length modulation in the O16 wzz mutant that was different from that of the parent, with
a decrease in short O chains in comparison to the O5 wzz mutant. Bastin and coworkers
(1993) showed that the modality of chain length distribution was dependent on the source
of the wzz gene. However, the pattern of LPS chain length distribution of O16 wzz mutants
carrying the O5 wzz gene on pFV401-26 resembled that of the O16 parent strain, rather
than the O5 strain (Fig. 46).

Ability of the P. aeruginosa O5 wzz gene to function in E. coli.
In order to determine whether wzz from P. aeruginosa O5 could
complement an enteric wzz mutation, E. coli strain CLM4, which is deleted for O-antigen
genes including wzz (Marolda and Valvano, 1993), was used. CLM4 was transformed with
either pSS37 (containing the O-antigen biosynthetic genes from S. dysenteriae type 1
without a wzz gene) alone, or with both pSS37 and pFV401, containing P. aeruginosa O5 wzz.
While LPS from E. coli CLM4/pSS37 showed an unregulated distribution of chain lengths,
LPS from E. coli CLM4/pSS37/pFV401 showed a restoration of modality, with a decrease in
short and very long O chains, and an increase in chains with approximately 10-20 repeats
(Fig. 47A).

The core oligosaccharide of the E. coli K-12 hybrid strain HB101, but
not K-12 itself, can act as an acceptor for P. aeruginosa O antigens (Goldberg et al., 1992;
Lightfoot and Lam, 1993). The structure of the HB101 core has not been elucidated.
Although E. coli HB101 carrying pFV100 had previously been shown to express LPS which
could be recognized by B-band-specific MAb MF15-4, its chain-length regulation had not
been examined. pFV100 is now known to contain a truncated wzz gene. The expression of LPS
from E. coli HB101 carrying both pFV100 and the complete O5 wzz gene on pFV401 was
examined. E. coli HB101 carrying pFV100 alone expressed an O5 O antigen with modulated
short-chain O-antigen molecules (Fig. 47B). When both pFV100 and pFV401 were present
in E. coli HB101, a dual LPS banding pattern was visible on Western immunoblots (Fig. 47B).
The coexpression of both E. coli and P. aeruginosa Wzz proteins resulted in a major group of
short O chains attributable to HB101 Wzz, and a minor group with longer chains
attributable to the P. aeruginosa O5 Wzz protein.

The identification of the rpsA and himD genes upstream of wzz
completes the delineation of the region of serogroup-specific DNA responsible for encoding
the B-band LPS O antigen of P. aeruginosa O5 and related serotypes. The entire O5 wbp
cluster is thus bounded by himD on the 5' end and uvrB on the 3' end and is approximately
24.3 kb from the start of wzz to the end of wbpN. The serogroup-specific portion is
approximately 18.4 kb from the start of wzz to the end of wbpL. Unlike enteric O-antigen
(rfb) clusters, the wbp cluster is not flanked by his and gnd, although there are two his
genes, hisH and hisF, located in the center of the cluster. The location of wzz upstream of
the wbp cluster in P. aeruginosa is opposite to that in many enteric bacteria, where wzz is
located downstream of the O-antigen cluster (Batchelor et al., 1992; Morona et al., 1995).
The presence of the rpsA and himD genes, which are highly conserved among bacterial
species, at the junction between the serogroup-specific and common regions suggests they
may have been the site of a past recombination event. himD encodes the β-subunit of IHF,
which has previously been shown to be involved in regulation of biosynthesis of the
exopolysaccharide alginate (Wozniak and Ohman, 1993; Wozniak, 1994).
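
The two cluster sizes quoted above can be reproduced from the base positions in Table 1. The sketch below rests on one stated assumption: Table 1 numbers bases from the XhoI site of pFV100, so the complete 1046 bp wzz ORF extends upstream of position 1 by the part not carried on pFV100 (1046 - 479 bp). The constant and function names are hypothetical.

```python
WZZ_ORF_BP = 1046        # complete wzz ORF, from the sequencing results
WZZ_ON_PFV100_BP = 479   # 3' portion of wzz on pFV100 (Table 1, positions 1-479)
WBPL_END = 17822         # last base of wbpL in Table 1
WBPN_END = 23693         # last base of wbpN in Table 1


def span_from_wzz_start(end_coord):
    """Distance in bp from the start of the complete wzz ORF to end_coord.

    Assumes the missing 5' part of wzz lies immediately upstream of
    position 1 of the Table 1 numbering.
    """
    upstream_bp = WZZ_ORF_BP - WZZ_ON_PFV100_BP
    return end_coord + upstream_bp


whole_cluster_kb = round(span_from_wzz_start(WBPN_END) / 1000, 1)
serogroup_specific_kb = round(span_from_wzz_start(WBPL_END) / 1000, 1)
```

Under that assumption the spans come out at 24.3 kb and 18.4 kb, matching the figures given in the text.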

The presence of a functional wzz gene in P. aeruginosa O5 confirms that
both the O-antigen polymerase, Wzy, and Wzz are required for expression of the
heteropolymeric B-band O antigen, as predicted by current models. Growing evidence
suggests that Wzz proteins may also play a role in the modulation of the length of capsular
exopolysaccharide polymers (Bik et al., 1996; Dodgson et al., 1996; Franco et al., 1996). A
possible homologue of the third component of Wzy-dependent systems, Wzx, is present in
the wbp cluster (Burrows et al., 1996).

The LPS banding pattern of enteric wzz mutants consists mainly of
short O chains with steadily decreasing amounts of longer chains (Fig. 47A). In contrast,
neither the O5 nor the O16 wzz mutants display this typical wzz phenotype, and the O16
mutant in particular continues to display some chain length regulation. It is possible that
chain length regulation in P. aeruginosa is not simply dependent on wzz. In the case of O16,
there may be a second wzz gene present in the O16 chromosome whose activity is normally
masked by the wzz of the O5 serogroup. Complementation of the O5 and O16 mutants by
wzz on a multicopy plasmid gave rise to strains whose LPS appeared even more tightly
regulated for size than that of the parent strains, since the complemented wzz mutants
lacked both short- and very long-chain modal groups, and had an increase in
medium-length groups. One possible interpretation of these results is that the regulation of
chain length by wzz in P. aeruginosa is normally imprecise, giving rise to groups with
multiples of the preferred chain length instead of a single group. This interpretation fits
the model of Bastin et al., 1993, who suggested that multimodal distributions of chain
lengths could result from reinitiation of polymerization without an intervening ligation
step.

Complementation of the O16 mutants by the O5 wzz gene restored them
to a phenotype resembling the O16 parent. Contrary to the findings of Bastin and
colleagues, 1993, these results show that in these closely related serotypes, the structure of
the O antigen, or possibly differences in the O5 vs O16 genetic background, determines the
preferred O-antigen chain length. While the O16 wzz and wzy genes have not been
isolated, they are probably highly similar to those of O5 based on the results of
high-stringency Southern blot analysis. The analysis of wzy from the related serotypes O2
and O5 demonstrated that the genes are essentially identical.

The P. aeruginosa O5 Wzz protein can modulate expression of both
homologous (P. aeruginosa O5) and heterologous (S. dysenteriae) O antigens in E. coli
although it has only 20% identity with the Wzz protein of E. coli. The ability of P.
aeruginosa Wzz to modulate a heterologous O antigen is consistent with previous work
showing Wzz is not specific for O-antigen type. When E. coli and P. aeruginosa Wzz
are expressed together in E. coli, the modulating effect of the E. coli Wzz predominates
although the P. aeruginosa wzz is present in multicopy. This difference can be seen in the
increased proportion of short O chains versus longer O chains which are expressed. Despite
variations in efficacy, it appears that the Wzz proteins from different Gram-negative
families function in an analogous manner and can act as interchangeable components of the
O-antigen assembly complex.

The ability of Wzz, Wzy and WaaL proteins with divergent primary
sequences to act reciprocally suggests that they are interacting through recognition of
common, conserved structural features. Although the amino acid similarities between the
Wzz proteins are low, their secondary structures are alike (Fig. 44). Similarly, although
the primary sequence similarities of the Wzy proteins from a number of bacteria are poor,
all have highly similar secondary structures containing multiple membrane-spanning
domains (Cryz et al., 1984). Comparison of the WaaL proteins from E. coli and S. enterica sv
Typhimurium, the only O-antigen ligases characterized to date, shows that they too have
conserved secondary structures, but less than 20% primary sequence homology (Liu and
Wang, 1990). In light of this information, it is now possible to target conserved structural
features of these proteins for modification in order to further define the areas critical for
putative protein interactions.

Having illustrated and described the principles of the invention in a
preferred embodiment, it should be appreciated by those skilled in the art that the
invention can be modified in arrangement and detail without departure from such
principles. We claim all modifications coming within the scope of the following claims.

All publications, patents and patent applications referred to herein
are incorporated by reference in their entirety to the same extent as if each individual
publication, patent or patent application was specifically and individually indicated to be
incorporated by reference in its entirety.

Below full citations are set out for the references referred to in the
specification and detailed legends for the figures are provided.
The application contains sequence listings which form part of the
application.
Pseudomonas aeruginosa serotype O5 wbp gene cluster.

locus     base positions   %G+C   MW encoded   AA^d   pI^e   H.I.^f   distribution^g

wzz^a     1-479            49.5   38.6 kDa     158    nd     nd       2, 5, 16, 18, 20
wbpA      1286-2596        54.5   48.2 kDa     436    5.36   -0.08    2, 5, 16, 18, 20
wbpB      2670-3620        52.8   35.8 kDa     316    6.40   -0.27    2, 5, 16, 18, 20
wbpC      3689-5578        53.1   69.9 kDa     629    9.06   0.48     2, 5, 16, 18, 20
wbpD      5575-6066        53.9   17.4 kDa     163    8.25   0.19     2, 5, 16, 18, 20
wbpE      6152-6982        52.8   29.9 kDa     276    5.26   -0.01    2, 5, 16, 18, 20
wzy^b     7236-8552        44.6   48.9 kDa     438    9.63   0.80     2, 5, 16, 18, 20
wbpF      8549-9499        49.0   33.8 kDa     316    9.49   0.99     2, 5, 16, 18, 20
hisH      9831-10388       49.3   20.9 kDa     185    nd     nd       2, 5, 16, 18, 20
hisF      10388-11143      50.0   27.5 kDa     251    nd     nd       2, 5, 16, 18, 20
wbpG      11281-12411      44.5   43.4 kDa     376    8.15   -0.38    2, 5, 16, 18, 20
wbpH      12427-13548      45.6   42.0 kDa     373    8.79   -0.21    2, 5, 16, 18, 20
wbpI      13545-14633      50.2   39.7 kDa     362    5.40   0.06     2, 5, 16, 18, 20
wbpJ      14651-15892      54.5   45.3 kDa     413    6.54   0.10     2, 5, 16, 18, 20
wbpK      15889-16851      56.8   34.4 kDa     320    9.03   0.14     2, 5, 16, 18, 20
wbpL^c    16911-17822      55.5   32.9 kDa     303    9.08   0.84     2, 5, 16, 18, 20
IS1209    17935-19144      59.3   nd           n/a    n/a    n/a      1 to 11, 13 to 20
wbpM      19678-21675      61.9   74.5 kDa     665    9.33   0.09     1 to 20
wbpN      22302-23693      63.6   48.5 kDa     463    6.12   -0.09    1 to 20
uvrB^a    23704-24417      61.    26.7 kDa     238    nd     nd       1 to 20

a truncated ORF
b de Kievit et al. (1995)
c wbpL was originally named rfbA; Dasgupta and Lam (1995)
d number of amino acids
e isoelectric point of the protein, calculated using GeneRunner
f hydropathic index of the protein, calculated using DNAsis for Windows (Hitachi).
Positive values indicate the protein is hydrophobic; negative values
indicate the protein is hydrophilic.
g distribution of the gene among the 20 serotypes of P. aeruginosa, based on positive
hybridization in high-stringency Southern blot analysis.
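
The base positions and amino acid counts listed for the wbp cluster can be checked against each other. The sketch below assumes that each listed coordinate range covers a complete ORF including its stop codon, so that the residue count equals the number of codons minus one; this convention is an inference from the numbers, not something the table states. Only the complete ORFs are included (the truncated wzz and uvrB entries and IS1209 are left out), and the names used are hypothetical.

```python
# (start, end, reported amino acids) for the complete ORFs of the wbp cluster.
ORFS = {
    "wbpA": (1286, 2596, 436), "wbpB": (2670, 3620, 316),
    "wbpC": (3689, 5578, 629), "wbpD": (5575, 6066, 163),
    "wbpE": (6152, 6982, 276), "wzy": (7236, 8552, 438),
    "wbpF": (8549, 9499, 316), "hisH": (9831, 10388, 185),
    "hisF": (10388, 11143, 251), "wbpG": (11281, 12411, 376),
    "wbpH": (12427, 13548, 373), "wbpI": (13545, 14633, 362),
    "wbpJ": (14651, 15892, 413), "wbpK": (15889, 16851, 320),
    "wbpL": (16911, 17822, 303), "wbpM": (19678, 21675, 665),
    "wbpN": (22302, 23693, 463),
}


def residues_from_coords(start, end):
    """Residues encoded by an ORF spanning start..end, stop codon included."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return length // 3 - 1  # subtract the stop codon


mismatches = [
    name for name, (start, end, aa) in ORFS.items()
    if residues_from_coords(start, end) != aa
]
```

An empty `mismatches` list indicates that the coordinates and residue counts are mutually consistent under this convention. Several adjacent ORFs overlap by a few bases (for example wbpC and wbpD), which the check does not treat as an error.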
TABLE 3

Amino acid homologies of HisH proteins.

       PA      AB      EC      HI      LL      SC      ST
PA     100.0
AB     53.6    100.0
EC     56.1    47.4    100.0
HI     51.8    47.9    63.3    100.0
LL     51.0    52.6    50.0    52.3    100.0
SC     54.9    47.9    ...     ...     48.0    100.0
ST     54.7    43.2    92.2    60.9    45.4    49.5    100.0

The amino acid sequences were aligned pairwise using the PC/GENE PALIGN program with the
following parameters: gap penalty = 5; window size = 10; open gap cost = 10; unit gap
cost = 10; filtering level = 2.5. The numbers shown are a summation of identical and
conserved amino acid residues.
TABLE 4

Amino acid homologies of HisF proteins.

       Pa      Ab      Ec      Hi      Kp      Ll      Rs      St
Pa     100.0
Ab     51.4    100.0
Ec     48.2    56.2    100.0
Hi     50.6    52.3    87.2    100.0
Kp     49.8    55.5    97.7    86.4    100.0
Ll     53.7    70.1    58.6    57.0    58.6    100.0
Rs     44.6    81.3    54.8    46.8    54.0    63.2    100.0
St     49.4    56.5    97.3    87.6    96.5    58.6    55.2    100.0

Amino acid homologies of HisF proteins from various bacterial species.
The amino acid sequences of various HisF proteins were aligned pairwise using the
PC/GENE PALIGN program with the following parameters: K-tuple value = 1; gap
penalty = 5; window size = 10; open gap cost = 10; unit gap cost = 10; filtering level = 2.5.
The numbers shown are a summation of identical and conserved amino acid residues. Key:
Pa, Pseudomonas aeruginosa O5 psb cluster HisF; Ab, Azospirillum brasilense HisF; Ec,
Escherichia coli HisF; Hi, Haemophilus influenzae HisF; Ll, Lactobacillus lactis HisF; Rs,
Rhodobacter sphaeroides HisF; and St, Salmonella enterica typhimurium HisF.
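
The homology values in these tables are described as a summation of identical and conserved residues. A minimal sketch of that kind of score for one pre-aligned pair is given below. The conservative-substitution groups and the choice to ignore gapped columns are assumptions made for illustration; the grouping actually used by the PC/GENE PALIGN program is not given in the specification.

```python
# Assumed conservative-substitution groups (illustrative only).
CONSERVED_GROUPS = [
    set("ILVM"), set("FYW"), set("KRH"),
    set("DE"), set("NQ"), set("ST"), set("AG"),
]


def is_conserved(x, y):
    """True if two residues fall in the same assumed substitution group."""
    return any(x in group and y in group for group in CONSERVED_GROUPS)


def percent_similarity(aligned_a, aligned_b):
    """Percentage of ungapped aligned columns that are identical or conserved."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    scored = 0
    matched = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # assumption: gapped columns are not counted
        scored += 1
        if x == y or is_conserved(x, y):
            matched += 1
    return 100.0 * matched / scored


# Toy alignment: M=M identical, K/R and L/I conserved, V/A neither.
example_score = percent_similarity("MKLV", "MRIA")
```

Applying such a function to every pair of sequences and recording the result once per pair gives a lower-triangular matrix of the form shown in the tables, with 100.0 on the diagonal.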
TABLE 5

Pairwise comparison of Rol amino acid homologies 1, 2

       PA      EC1     EC2     SF      ST
PA     100.0
EC1    34.4    100.0
EC2    35.1    79.3    100.0
SF     35.4    79.0    98.1    100.0
ST     32.8    78.6    81.5    81.2    100.0

1 Analyses were done using the PC/GENE PALIGN program.



2 Key: PA, P. aeruginosa Rol; EC1, E. coli O75 Rol; EC2, E. coli O111 Cld; SF, ...
... by some (Bastin et al., 1993) to describe ...
TABLE 6

Bacterial strains and plasmids

Strain or plasmid     Genotype or relevant characteristics              Reference or source

P. aeruginosa
PAO1                  serotype O5, A+, B+                               Hancock and Carey (1979)
AK1401                mutant of OT684^a, A+, B-band contains core +     Berry and Kropinski (1986)
                      one O-repeat unit (SR)
rd7513                mutant of AK1401, A-, B-band contains core +      Lightfoot and Lam (1991)
                      one O-repeat unit (SR)
OP5.2                 mutant of PAO1, A+, B-band contains core + one    This study
                      O-repeat unit (SR)
OP5.3                 mutant of PAO1, A+, B-band contains core + one    This study
                      O-repeat unit (SR)
OP5.5                 mutant of PAO1, A+, B-band contains core + one    This study
                      O-repeat unit (SR)

E. coli
DH5                   supE44 hsdR17 recA1 endA1 gyrA96                  GIBCO/Bethesda Research
                      thi-1 relA1                                       Laboratories
HB101                 supE44 hsdS20(rB- mB-) recA13 ara-14 proA2        Boyer and Roulland-Dussoix
                      lacY1 galK2 rpsL20 xyl-5 mtl-1                    (1969)
                      F- StrR
SM10                  thi-1 thr leu tonA lacY supE recA RP4-2-Tc::Mu    Simon et al. (1983)
                      KmR

Plasmids
pFV100                pCP13 derivative containing cloned PAO1 O-        Lightfoot and Lam (1993)
                      antigen biosynthetic genes on a 26 kb insert
pCP13                 RK2 derivative; cos+, Mob+, Tra-, TcR KmR         Darzins and Chakrabarty (1984)
pRK404                RK2 derivative; Mob+, Tra-, TcR                   Ditta et al. (1985)
pUCP26                pUC18-derived broad-host-range vector, TcR        West et al. (1994)
pEX100T               gene-replacement vector, oriT+, sacB+, ApR        Schweizer and Hoang
                                                                        (submitted)
pUCPGM                source of GmR cassette; ApR GmR                   Schweizer (1993)
pBluescript KS (+/-)  ApR                                               PDI Biosciences, Aurora, ON

a OT684 is the immediate progenitor strain of AK1401 and is a restrictionless mutant of PAO1
(Potter and Loutit, 1982).
TABLE 7

Rfc proteins of P. aeruginosa and other gram-negative organisms

a Molecular weight based on nucleotide sequence.
b Hydropathy index deduced from hydrophobicity analysis (Kyte and Doolittle, 1982).
c Percentage of the bases G and C in the coding sequence.
TABLE 8

Bacterial strains and plasmids used in this study.

Strain or plasmid   Genotype, phenotype or properties                          Reference / source

P. aeruginosa
O5                  strain PAO1, wild type; A+ B+                              20
O5 wzz              PAO1, wzz insertion mutation at XhoI; A+ B+                this study
IATS O16            serotype O16, wild type; A- B+                             33
O16 wzz             serotype O16, wzz insertion mutation at XhoI; A- B+        this study

E. coli
JM109               recA1 supE44 endA1 hsdR17 gyrA96 relA1 thi                 53
                    Δ(lac-proAB) F'[traD36 proAB+ lacIq lacZΔM15]
SM10                thi-1 thr leu tonA lacY supE recA RP4-2-Tc::Mu, KmR        45
HB101               F- thi-1 hsdS20 set A ara-14 proA2 lacY1 galK2 rpsL20      27
                    xyl mtl-1 supE44 recA13 leuB6 StrR
CLM4                lacZ2286 trp-49 Δ(sbcB-rfb)86 upp-12 relA1 rpsL150 λ-      35
                    recA

Plasmids
pFV100              24.4 kb XhoI fragment in cosmid pCP13; contains the        8/31
                    wbp cluster
pFV400              25.0 kb Sau3AI fragment in pCP13; overlaps pFV100          this study
pFV401              2.3 kb HindIII fragment in pBluescript II SK; contains     this study
                    the P. aeruginosa O5 wzz gene
pFV401-26           same insert in pUCP26                                      this study
pFV401TGm           same insert in pEX100T, with GmR cassette inserted at      this study
                    unique XhoI site within wzz
pFV403              3.0 kb SstI fragment in pBluescript II SK; contains 5'     this study
                    portion of wzz and upstream sequences
pBluescript II SK   2.9 kb cloning vector containing T7 promoter; ApR          Stratagene
pUCP26              4.9 kb pUC18-based broad-host-range vector; TcR            48
pEX100T             gene-replacement vector; oriT+, sacB+, ApR                 44
pUCPGM              source of gentamicin resistance cassette; ApR, GmR         44

SUBSTITUTE SHEET (RULE 26) 



WO 97/41234 



PCT/CA97/00295 



-75- 









REFERENCES 

Alifano, P., Fani, R., Liò, P., Lazcano, A., Bazzicalupo, M., Stella Carlomagno, M., and
Bruni, C.B. (1996) Histidine biosynthetic pathway and genes: structure, regulation, and
evolution. Microbiol Rev 60: 44-69.

Allen and Maskell, (1996) The identification, cloning and mutagenesis of a genetic locus
required for lipopolysaccharide biosynthesis in Bordetella pertussis. Mol Microbiol 19:
37-52.

Altschul, S.F., W. Gish, W. Miller, E.W. Myers, and D.J. Lipman. 1990. Basic local
alignment search tool. J. Mol. Biol. 215:403-410.

Amor, P., and L. Mutharia. (1995) Cloning and expression of rfb genes from Vibrio
anguillarum serotype O2 in Escherichia coli: evidence for cross-reactive epitopes. Infect
Immun 63: 3537-3542.

Arisawa, A., Tsunekawa, H., Okamura, K. and Okamoto, R. (1995) Nucleotide sequence
analysis of the carbomycin biosynthetic genes including the 3-O-acyltransferase gene from
Streptomyces thermotolerans. Biosci Biotechnol Biochem 59: 582-588.

Arsenault, T. L., Hughes, D. W., MacLean, D. B., Szarek, W. A., Kropinski, A. M. B. and
Lam, J. S. 1991. Structural studies on the polysaccharide portion of "A-band"
lipopolysaccharide from a mutant (AK1401) of P. aeruginosa strain PAO1. Can J Chem 69:
1273-1280.

Bastin, D.A., G. Stevenson, P.K. Brown, A. Haase, and P.R. Reeves. 1993. Repeat unit
polysaccharides of bacteria: a model for polymerization resembling that of ribosomes and
fatty acid synthetase, with a novel mechanism for determining chain length. Mol.
Microbiol. 7:725-734.

Batchelor, R.A., P. Alifano, E. Biffali, S.I. Hull, and R.A. Hull. 1992. Nucleotide
sequences of the genes regulating O-polysaccharide antigen chain length (rol) from
Escherichia coli and Salmonella typhimurium: Protein homology and functional
complementation. J. Bacteriol. 174:5228-5236.

Bechthold, A., Sohng, J.K., Smith, T.M., Chu, X. and Floss, H.G. (1995) Identification of
Streptomyces violaceoruber Tü22 genes involved in the biosynthesis of granaticin. Mol Gen
Genet 248: 610-620.

Berry, D., and Kropinski, A. M. 1986. Effect of lipopolysaccharide mutations and
temperature on plasmid transformation efficiency in P. aeruginosa. Can J Microbiol
32:436-438.

Bik, E.M., A.E. Bunschoten, R.J.L. Willems, A.C.Y. Chang, and F.R. Mooi. 1996. Genetic
organization and functional analysis of the otn DNA essential for cell-wall polysaccharide
synthesis in Vibrio cholerae O139. Mol. Microbiol. 20:799-811.






Binotto, J., MacLachlan, R., and Sanderson, K. E. 1991. Biotransformation in
Salmonella typhimurium LT2. Can J Microbiol 37:474-477.

Birnboim, H. C., and Doly, J. 1979. A rapid alkaline extraction procedure for screening
recombinant plasmid DNA. Nucleic Acids Res. 7:1513-1523.

Bogdanova, N., Bork, C., and Hell, R. (1995) Cysteine biosynthesis in plants: isolation and
functional identification of a cDNA encoding a serine acetyltransferase from Arabidopsis
thaliana. FEBS Lett 358: 43-47.

Boyer, H. W., and Roulland-Dussoix, D. 1969. A complementation analysis of the
restriction and modification of DNA in Escherichia coli. J Mol Biol 41:459-496.

Brown, P. K., Romana, L. K., and Reeves, P. R. 1992. Molecular analysis of the rfb gene
cluster of Salmonella serovar muenchen (strain M67): the genetic basis of the polymorphism
between groups C2 and B. Mol Microbiol 6:1385-1394.

Buendia, A.M., Enenkel, B., Köplin, R., Niehaus, K., Arnold, W., and Pühler, A. (1991) The
Rhizobium meliloti exoZ/exoB fragment of megaplasmid 2: ExoB functions as a
UDP-glucose 4-epimerase and ExoZ shows homology to NodX of Rhizobium leguminosarum
biovar viciae strain TOM. Mol Microbiol 5: 1519-1530.

Burnette, W.N. 1981. Western blotting: electrophoretic transfer of proteins from sodium
dodecyl sulphate-polyacrylamide gels to unmodified nitrocellulose and radiographic
detection with antibody and radioiodinated protein A. Anal. Biochem. 112:195-203.

Burrows, L.L., D. Chow, and J.S. Lam. 1997. Pseudomonas aeruginosa B-band O antigen
chain length is modulated by Wzz (Rol). J. Bacteriol. 179: in press.

Burrows, L.L., D.F. Charter, and J.S. Lam. 1996. Molecular characterization of the
Pseudomonas aeruginosa serotype O5 B-band lipopolysaccharide gene cluster. Mol.
Microbiol. 22:481-495.

Collins, L. V., and Hackett, J. 1991. Molecular cloning, characterization, and nucleotide
sequence of the rfc gene, which encodes an O-antigen polymerase of Salmonella
typhimurium. J Bacteriol 173:2521-2529.

Comstock, L.E., Johnson, J.A., Michalski, J.M., Morris, J.G., Jr., and Kaper, J.B. (1996) Cloning
and sequence of a region encoding a surface polysaccharide of Vibrio cholerae O139 and
characterization of the insertion site in the chromosome of Vibrio cholerae O1. Mol
Microbiol 19: 815-826.

Cryz, S.J. Jr., T.L. Pitt, E. Fürer, and R. Germanier. 1984. Role of lipopolysaccharide in
virulence of Pseudomonas aeruginosa. Infect. Immun. 44:508-513.

Daniels, D.L., Plunkett, G., Burland, V., and Blattner, F.R. (1992) Analysis of the
Escherichia coli genome: DNA sequence of the region from 84.5 to 86.5 minutes. Science 257:
771-778.






Darzins, A., and Chakrabarty, A. M. 1984. Cloning of genes controlling alginate
biosynthesis from a mucoid cystic fibrosis isolate of P. aeruginosa. J Bacteriol 159:9-18.

Dasgupta, T., and Lam, J. S. Identification of putative rfb genes involved in B-band
lipopolysaccharide biosynthesis in P. aeruginosa serotype O5. Submitted for publication.

Dasgupta, T., and J.S. Lam. (1995) Identification of rfbA, involved in B-band
lipopolysaccharide biosynthesis in Pseudomonas aeruginosa serotype O5. Infection and
Immunity 63: 1674-1680.

Dasgupta, T., Malburg, S., and Lam, J. S. 1993. Program Abstr 93rd Gen Meet Amer Soc
Microbiol abstr. D-240.

Davis, E.O., Evans, I.J. and Johnston, A.W. (1988) Identification of nodX, a gene that
allows Rhizobium leguminosarum biovar viciae strain TOM to nodulate Afghanistan peas.
Mol Gen Genet 212: 531-535.

Denk, D. and Böck, A. (1987) L-cysteine biosynthesis in Escherichia coli: nucleotide
sequence and expression of the serine acetyltransferase (cysE) gene from the wild-type and a
cysteine-excreting mutant. J Gen Microbiol 133: 515-525.

de Kievit, T.R., T. Dasgupta, H. Schweizer, and J.S. Lam. 1995. Molecular cloning and
characterization of the rfc gene of Pseudomonas aeruginosa (serotype O5). Mol. Microbiol.
16:565-574.

de Kievit, T.R., and J.S. Lam. 1997. Pseudomonas aeruginosa rfc genes of serotypes O2 and
O5 could complement O-polymerase-deficient SR mutants of either serotype. FEMS
Microbiol. Letters, in press.

de Kievit, T. R., and Lam, J. S. 1994. Program Abstr 94th Gen Meet Amer Soc Microbiol
abstr. D-192.

de Lencastre, H., Chak, K.-F., and Piggot, P. J. 1983. Use of Escherichia coli transposon
Tn1000 (γδ) to generate mutations in Bacillus subtilis DNA. J Gen Microbiol 129:3202-3210.

Delic-Attree, I., B. Toussaint, and P.M. Vignais. 1995. Cloning and sequence analyses of the
genes coding for the integration host factor (IHF) and HU proteins of Pseudomonas
aeruginosa. Gene 154:61-64.

Deretic, V., Gill, J.F., and Chakrabarty, A.M. (1987) Gene algD coding for GDPmannose
dehydrogenase is transcriptionally activated in mucoid Pseudomonas aeruginosa. J
Bacteriol 169: 351-358.

Dhillon, N., Hale, R.S., Cortes, J., and Leadlay, P.F. (1989) Molecular characterization of
a gene from Saccharopolyspora erythraea (Streptomyces erythraeus) which is involved in
erythromycin biosynthesis. Mol Microbiol 3: 1404-1414.






Ditta, G., Schmidhauser, T., Yakobson, E., Su, P., Liang, X.-W., Finlay, D. R., Guiney, D.,
and Helinski, D. R. 1985. Plasmids related to the broad host range vector, pRK290, useful
for gene cloning and for monitoring gene expression. Plasmid 13:149-153.

Dodgson, C., P. Amor, and C. Whitfield. 1996. Distribution of the rol gene encoding the
regulator of lipopolysaccharide O-chain length in Escherichia coli and its influence on the
expression of group I capsular K antigens. J. Bacteriol. 178:1895-1902.

Dubray, G., and G. Bezard. 1982. A highly sensitive periodic acid-silver stain for 1,2-diol
groups of glycoproteins and polysaccharides in polyacrylamide gels. Anal Biochem
119:325-329.

Falah, M. and R. S. Gupta. 1994. Cloning of the hsp70 (dnaK) genes from Rhizobium
meliloti and Pseudomonas cepacia: phylogenetic analyses of mitochondrial origin based on
a highly conserved protein sequence. J Bacteriol 176: 7748-7753.

Farinha, M. A., and Kropinski, A. M. 1990. High efficiency electroporation of P. aeruginosa
using frozen cell suspensions. FEMS Microbiol Lett 70:221-226.

Fleischmann, R.D., Adams, M.D., White, O., Clayton, R.A., Kirkness, E.F., Kerlavage,
A.R., Bult, C.J., Tomb, J.-F., Dougherty, B.A., Merrick, J.M., McKenney, K., Sutton, G.,
FitzHugh, W., Fields, C.A., Gocayne, J.D., Scott, J.D., Shirley, R., Liu, L.-I., Glodek, A.,
Kelley, J.M., Weidman, J.F., Phillips, C.A., Spriggs, T., Hedblom, E., Cotton, M.D.,
Utterback, T.R., Hanna, M.C., Nguyen, D.T., Saudek, D.M., Brandon, R.C., Fine, L.D.,
Fritchman, J.L., Fuhrmann, J.L., Geoghagen, N.S.M., Gnehm, C.L., McDonald, L.A., Small,
K.V., Fraser, C.M., Smith, H.O. and Venter, J.C. (1995) Whole-genome random sequencing
and assembly of Haemophilus influenzae Rd. Science 269: 496-512.

Franco, A.V., D. Liu, and P.R. Reeves. 1996. A Wzz (Cld) protein determines the chain
length of K lipopolysaccharide in Escherichia coli O8 and O9 strains. J. Bacteriol.
178:1903-1907.

Gagnon, Y., Breton, R., Putzer, H., Pelchat, M., Grunberg-Manago, M., and Lapointe, J.
(1994) Clustering and co-transcription of the Bacillus subtilis genes encoding the
aminoacyl-tRNA synthetases specific for glutamate and for cysteine and the first enzyme
for cysteine biosynthesis. J Biol Chem 269: 7473-7482.

Gish, W., and D.J. States. 1993. Identification of protein coding regions by database
similarity search. Nature Genet. 3:266-272.

Glaser, P., Kunst, F., Arnaud, M., Coudart, M.-P., Gonzales, W., Hullo, M.-F., Ionescu, M.,
Lubochinsky, B., Marcelino, L., Moszer, I., Presecan, E., Santana, M., Schneider, E.,
Schweizer, J., Vertes, A., Rapoport, G., and Danchin, A. (1993) Bacillus subtilis genome
project: cloning and sequencing of the 97 kb region from 325° to 333°. Mol Microbiol 10:
371-384.

Glucksmann, M.A., Reuber, T.L., Walker, G.C. (1993) Genes needed for the modification,
polymerization, export and processing of succinoglycan by Rhizobium meliloti: a model for
succinoglycan biosynthesis. J Bacteriol 175: 7045-7055.

Göhmann, S., Manning, P.A., Alpert, C.A., Walker, M.J., and Timmis, K.N. (1994)
Lipopolysaccharide O-antigen biosynthesis in Shigella dysenteriae serotype 1: analysis of
the plasmid-carried rfp determinant. Microb Pathog 16: 53-64.

Gold, L., and Stormo, G. (1987) Transcriptional initiation. In Escherichia coli and
Salmonella typhimurium: Cellular and Molecular Biology, Vol. 2. Neidhardt, F.C. (ed).
Washington, D.C.: American Society for Microbiology, pp. 807-876.

Goldberg, J.B., K. Hatano, G. Small Meluleni, and G.B. Pier. 1992. Cloning and surface
expression of Pseudomonas aeruginosa O antigen in Escherichia coli. Proc. Natl. Acad. Sci.
USA 89:10716-10720.

Goldberg, J.B., and D.E. Ohman. 1984. Cloning and expression in Pseudomonas aeruginosa of
a gene involved with the production of alginate. J. Bacteriol. 158:1115-1121.

Goldman, R.C., and L. Leive. 1980. Heterogeneity of antigenic-side-chain length in
lipopolysaccharide from Escherichia coli O111 and Salmonella typhimurium LT2. Eur. J.
Biochem. 107:145-153.

Gotschlich, 1994.

Hammerschmidt, S., Birkholz, C., Zähringer, U., Robertson, B.D., van Putten, J., Ebeling,
O., and Frosch, M. (1994) Contribution of genes from the capsule gene complex (cps) to
lipooligosaccharide biosynthesis and serum resistance in Neisseria meningitidis. Mol
Microbiol 11: 885-896.

Hancock, R.E.W., and A.M. Carey. 1979. Outer membrane of Pseudomonas aeruginosa:
heat- and 2-mercaptoethanol-modifiable proteins. J. Bacteriol. 158: 1115-1121.

Harley, C.B. and R. P. Reynolds (1987) Analysis of E. coli promoter sequences. Nucleic
Acids Res 15: 2343-2361.

Hashimoto, Y., Li, N., Yokoyama, H. and Ezaki, T. (1993) Complete nucleotide sequence
and molecular characterization of ViaB region encoding Vi antigen in Salmonella typhi. J
Bacteriol 175: 4456-4465.

Hitchcock, P.J., and T.M. Brown. 1983. Morphological heterogeneity among Salmonella
lipopolysaccharide chemotypes in silver-stained polyacrylamide gels. J. Bacteriol.
154:269-277.






Holloway, B.W., Römling, U., Tümmler, B. (1994) Genomic mapping of Pseudomonas
aeruginosa PAO. Microbiology 140: 2907-2929.

Huang, J., and Schell, M. (1995). Molecular characterization of the eps gene cluster of
Pseudomonas solanacearum and its transcriptional regulation at a single promoter. Mol
Microbiol 16: 977-989.

Huff, J.P., B.J. Grant, C.A. Penning, and K.F. Sullivan. 1990. Optimization of routine
transformation of Escherichia coli with plasmid DNA. Biotechniques 9:570-577.

Jarosik, G. P. and E. J. Hansen. 1994. Identification of a new locus involved in expression of
Haemophilus influenzae type b lipooligosaccharide. Infect Immun 62: 4861-4867.

Jiang, X. M., B. Neal, F. Santiago, S. J. Lee, L. K. Romana, and P. R. Reeves (1991). Structure
and sequence of the rfb (O antigen) gene cluster of Salmonella serovar typhimurium (strain
LT2). Mol Microbiol 5: 695-713.

Kao, C. C. and L. Sequeira. 1991. A gene cluster required for coordinated biosynthesis of
lipopolysaccharide and extracellular polysaccharide also affects virulence of Pseudomonas
solanacearum. J Bacteriol 173: 7841-7847.

Kearney, B., and Staskawicz, B.J. (1990) Characterization of IS476 and its role in bacterial
spot disease of tomato and pepper. J Bacteriol 172: 143-148.

Keenleyside, W. J., M. Perry, L. Maclean, C. Poppe and C. Whitfield. 1994. A
plasmid-encoded rfb O:54 gene cluster is required for biosynthesis of the O:54 antigen in
Salmonella enterica serovar Borreze. Mol Microbiol 11: 437-448.

Keenleyside, W.J., and Whitfield, C. (1995) Lateral transfer of rfb genes: a mobilizable
ColE1-type plasmid carries the rfb O:54 (O:54 antigen biosynthesis) gene cluster from
Salmonella enterica serovar Borreze. J Bacteriol 177: 5247-5253.

Keenleyside, W.J., and C. Whitfield. 1996. A novel pathway for O-polysaccharide
biosynthesis in Salmonella enterica serovar Borreze. J. Biol. Chem. 271:28581-28592.

Kingsley, M.T., D. W. Gabriel, G. C. Marlow, and P. D. Roberts. 1993. The opsX locus of
Xanthomonas campestris affects host range and biosynthesis of lipopolysaccharide and
extracellular polysaccharide. J Bacteriol 175: 5839-50.

Klein, P., Kanehisa, M., and DeLisi, C. 1985. Description of one of the methods used in
SOAP. Biochimica et Biophysica Acta 815:468-476.

Klena, J. D., and Schnaitman, C.A. 1993. Function of the rfb gene cluster and the rfe gene in
the synthesis of O-antigen by Shigella dysenteriae 1. Mol Microbiol 9:393-402.

Knirel, Y. A. 1990. Polysaccharide antigens of P. aeruginosa. Crit Rev Microbiol
17:273-304.




Knirel, Y.A., and N.K. Kochetkov. 1994. The structure of lipopolysaccharides of
Gram-negative bacteria. III. The structure of O-antigens: a review. Biochemistry
(Moscow) 59:1325-1383.

Knirel, Y.A., E.V. Vinogradov, N.A. Kocharova, N.A. Paramonov, N.K. Kochetkov, B.A.
Dmitriev, E.S. Stanislavsky, and B. Lanyi. 1988. The structure of O-specific
polysaccharides and the serological classification of Pseudomonas aeruginosa. Acta
Microbiol. Hung. 35:3-24.

Kuenzler, M., Balmelli, T., Egli, C.M., Paravicini, G., and Braus, G.H. (1993) Cloning,
primary structure, and regulation of the HIS7 gene encoding a bifunctional glutamine
amidotransferase:cyclase from Saccharomyces cerevisiae. J Bacteriol 175: 5548-5558.

Kuzio, J., and Kropinski, A.M. (1983) O-antigen conversion in Pseudomonas aeruginosa
PAO1 by bacteriophage D3. J Bacteriol 155: 203-212.

Lacks, S., and J.R. Greenberg. 1977. Complementary specificity of restriction endonucleases
of Diplococcus pneumoniae with respect to DNA methylation. J. Mol. Biol. 114: 153-168.

Lam, M.Y.C., E.J. McGroarty, A.M. Kropinski, L.A. MacDonald, S.S. Pedersen, N. Høiby, and
J.S. Lam. 1989. Occurrence of a common lipopolysaccharide antigen in standard and clinical
strains of Pseudomonas aeruginosa. J. Clin. Microbiol. 27:962-967.

Lam, J.S., M.Y.C. Handelsman, T.R. Chivers, and L.A. MacDonald. 1992. Monoclonal
antibodies as probes to examine serotype-specific and cross-reactive epitopes of
lipopolysaccharides from serotypes O2, O5, and O16 of Pseudomonas aeruginosa. J.
Bacteriol. 174:2178-2184.

Lai, C.-Y. and Baumann, P. (1992) Sequence analysis of a DNA fragment from Buchnera
aphidicola (an endosymbiont of aphids) containing genes homologous to dnaC, rpoD, cysE,
and secB. Gene 119: 113-118.

Lightfoot, J.L., and J.S. Lam. 1991. Molecular cloning of genes involved with expression of
A-band lipopolysaccharide, an antigenically conserved form, in Pseudomonas aeruginosa. J.
Bacteriol. 173:5624-5630.

Lightfoot, J.L., and J.S. Lam. 1993. Chromosomal mapping, expression and synthesis of
lipopolysaccharide in Pseudomonas aeruginosa: a role for guanosine diphospho
(GDP)-D-mannose. Mol. Microbiol. 8:771-782.

Liu, D., R.A. Cole, and P. R. Reeves. 1996. An O-antigen processing function for Wzx (RfbX):
a promising candidate for O-unit flippase. J. Bacteriol. 178:2102-2107.

Liu, P.V. and S. Wang. 1990. Three new major somatic antigens of Pseudomonas aeruginosa.
J. Clin. Microbiol. 28:922-925.

Lin, W.S., Cunneen, T. and Lee, C.Y. (1994) Sequence analysis and molecular
characterization of genes required for the biosynthesis of type 1 capsular polysaccharide in
Staphylococcus aureus. J Bacteriol 176: 7005-7016.






Liu, P. V., Matsumoto, H., Kusama, H., and Bergan, T. 1983. Survey of heat-stable major
somatic antigens of P. aeruginosa. Int J Syst Bacteriol 33:256-264.

Macpherson, D.F., Manning, P.A., and Morona, R. (1994) Characterization of the dTDP-
rhamnose biosynthetic genes encoded in the rfb locus of Shigella flexneri. Mol Microbiol
11: 281-292.

MacLachlan, P.R., S.K. Kadam, and K.E. Sanderson. 1991. Cloning, characterization, and
DNA sequence of the rfaLK region for lipopolysaccharide synthesis in Salmonella
typhimurium LT2. J. Bacteriol. 173:7151-7163.

Mäkelä, P. H., and Stocker, B. A. D. 1984. Genetics of lipopolysaccharide, p. 59-137. In E.
T. Rietschel (ed.), Handbook of endotoxin, vol. 1. Elsevier Science Publishing, Amsterdam.

Marolda, C.L., and M.A. Valvano. 1993. Identification, expression, and DNA sequence of
the GDP-mannose biosynthesis genes encoded by the O7 rfb cluster of strain VW187
(Escherichia coli O7:K1). J. Bacteriol. 175:148-158.

Marolda, C.L., and Valvano, M.A. (1995) Genetic analysis of the dTDP-rhamnose
biosynthesis region of the Escherichia coli VW187 (O7:K1) rfb gene cluster: identification
of functional homologs of rfbB and rfbA in the rff cluster and correct location of the rffE
gene. J Bacteriol 177: 5539-5546.

May, T.B., D. Shinabarger, R. Maharaj, J. Kato, L. Chu, J.D. DeVault, S. Roychoudhury,
N.A. Zielinski, A. Berry, R.K. Rothmel, T.K. Misra, and A.M. Chakrabarty. 1991.
Alginate synthesis by Pseudomonas aeruginosa: a key pathogenic factor in chronic
pulmonary infections of cystic fibrosis patients. Clin. Microbiol. Rev. 4:191-206.

Meier-Dieter, U., Barr, K., Starman, R., Hatch, L. and Rick, P.D. (1992) Nucleotide
sequence of the Escherichia coli rfe gene involved in the synthesis of enterobacterial
common antigen: Molecular cloning of the rfe-rff gene cluster. J Biol Chem 267: 746-753.

Morona, R., Mavris, M., Fallarino, A., and Manning, P. A. 1994. Characterization of the rfc
region of Shigella flexneri. J Bacteriol 176: 733-747.

Morona, R., L. van den Bosch, and P.A. Manning. 1995. Molecular, genetic, and topological
characterization of O-antigen chain length regulation in Shigella flexneri. J. Bacteriol.

Nurminen, M., Hellerqvist, C. E., Valtonen, V. V., and Mäkelä, P. H. 1971. The smooth
lipopolysaccharide character of 1, 4, (5), 12 and 1, 9, 12 transductants formed as hybrids
between groups B and D of Salmonella. Eur J Biochem 22: 500-505.

Ogasawara, N., Nakai, S. and Yoshikawa, H. (1994) Systematic sequencing of the 180
kilobase region of the Bacillus subtilis chromosome containing the replication origin. DNA
Res.






Ozenberger, B.A., M. Schrodt Nahlik, and M.A. McIntosh. 1987. Genetic organization of
multiple fep genes encoding ferric enterobactin transport functions in Escherichia coli. J.
Bacteriol. 169:3638-3646.

Palleroni, N. J. 1984. Genus I. Pseudomonas, p. 141-199. In N. R. Krieg and J. C. Holt (ed.),
Bergey's Manual of Systematic Bacteriology, Vol. 1, Williams and Wilkins, Baltimore.

Peschke, U., Schmidt, H., Zhang, H.Z. and Piepersberg, W. (1995) Molecular
characterization of the lincomycin-production gene cluster of Streptomyces lincolnensis
78-11. Mol Microbiol 16: 1137-1156.

Potter, A. A. and Loutit, J. S. 1982. Exonuclease activity from P. aeruginosa which is
missing in phenotypically restrictionless mutants. J Bacteriol 151: 1204-1209.

Prere, M.F., Chandler, M., and Fayet, O. (1990) Transposition in Shigella dysenteriae:
isolation and analysis of IS921, a new member of the IS3 group of insertion sequences. J
Bacteriol 172: 4090-4099.

Priefer, U.B., Kalinowski, J., Rüger, B., Heumann, W., and Pühler, A. (1989) ISR1, a
transposable DNA sequence resident in Rhizobium class IV strains, shows structural
characteristics of classical insertion elements. Plasmid 21: 120-128.

Pritchard, A.E., and Vasil, M.L. (1990) Possible insertion sequences in a mosaic genome
organization upstream of the exotoxin A gene in Pseudomonas aeruginosa. J Bacteriol 172:
2020-2028.

Quirk, P.G., Guffanti, A. A., Clejan, S., Cheng, J., and Krulwich, T.A. (1994) Isolation of
Tn917 insertional mutants of Bacillus subtilis that are resistant to the protonophore
carbonyl cyanide m-chlorophenylhydrazone. Biochim Biophys Acta 1186: 27-34.

Reeves, P. (1993) Evolution of Salmonella O antigen variation by interspecific gene transfer
on a large scale. Trends Genet 9: 17-22.

Reeves, P.R., M. Hobbs, M. Valvano, M. Skurnik, C. Whitfield, D. Coplin, N. Kido, J.
Klena, D. Maskell, C. Raetz, and P. Rick. 1996. Proposal for a new nomenclature for
bacterial surface polysaccharide genes. Trends Microbiol. 4: 495-503.

Rieder, G., Merrick, M.J., Castorph, H., Kleiner, D. (1994) Function of hisF and hisH gene
products in histidine biosynthesis. J Biol Chem 269: 14386-14390.

Rivera, M., Bryan, L. E., Hancock, R. E. W. and McGroarty, E. J. 1988. Heterogeneity of
lipopolysaccharides from P. aeruginosa: analysis of lipopolysaccharide chain length. J
Bacteriol 170:512-521.

Rivera, M., T.R. Chivers, J.S. Lam, and E.J. McGroarty. 1992. Common antigen
lipopolysaccharide from Pseudomonas aeruginosa AK1401 as a receptor for bacteriophage
A7. J. Bacteriol. 174:2407-2411.






Rossbach, S., D. A. Kulpa, U. Rossbach and F. J. de Bruijn (1994) Molecular and genetic
characterization of the rhizopine catabolism (mocABRC) genes of Rhizobium meliloti
L5-30. Mol Gen Genet 245: 11-24.

Ruvkun, G. B., and Ausubel, F. M. 1981. A general method for site-directed mutagenesis in
prokaryotes. Nature (London) 289:85-88.

Schnaitman, C.A., and J.D. Klena. 1993. Genetics of lipopolysaccharide biosynthesis in
enteric bacteria. Microbiol. Rev. 57: 655-682.

Schnier, J., M. Kimura, K. Foulaki, A.R. Subramanian, K. Isono, and B. Wittmann-Liebold.
1982. Primary structure of Escherichia coli ribosomal protein S1 and of its gene rpsA. Proc.
Natl. Acad. Sci. U.S.A. 79:1008-1011.

Schweizer, H. P. 1993. Small broad-host-range gentamycin resistance gene cassettes for
site-specific insertion and deletion mutagenesis. BioTechniques 15:831-833.

Schweizer, H.P., and T.T. Hoang. 1995. An improved system for gene replacement and
xylE fusion analysis in Pseudomonas aeruginosa. Gene 158:15-22.

Segal, G. and E. Z. Ron (1995) The dnaKJ operon of Agrobacterium tumefaciens:
transcriptional analysis and evidence for a new heat shock promoter. J Bacteriol 177:
5952-5958.

Simon, R., Priefer, U., and Pühler, A. 1983. A broad-host-range mobilization system for in
vivo genetic engineering: transposon mutagenesis in gram negative bacteria.
Bio/Technology 1:784-791.

Skurnik, M., Venho, R., Toivanen, P., and Al-Hendy, A. (1995). A novel locus of Yersinia
enterocolitica serotype O:3 involved in lipopolysaccharide outer core biosynthesis. Mol
Microbiol 17: 575-594.

Sokol, P.A., Luan, M.Z., Storey, D.G., and Thirukkumaran, P. (1994) Genetic rearrangement
associated with in vivo mucoid conversion of Pseudomonas aeruginosa PAO is due to
insertion elements. J Bacteriol 176: 553-562.

Soldo, B., Lazarevic, V., Margot, P., and Karamata, D. (1993) Sequencing and analysis of
the divergon comprising gtaB, the structural gene of UDP-glucose pyrophosphorylase of
Bacillus subtilis 168. J Gen Microbiol 139: 3185-3195.

Stutzman-Engwall, K.J., Otten, S.L., and Hutchinson, C.R. (1992) Regulation of secondary
metabolism in Streptomyces spp. and overproduction of daunorubicin in Streptomyces
peucetius. J Bacteriol 174: 144-154.

Sturm, S. and K.N. Timmis. 1986. Cloning of the rfb region of Shigella dysenteriae 1 and
construction of an rfb-rfp gene cassette for the development of lipopolysaccharide-based
live anti-dysentery vaccines. Microb. Pathog. 1:289-297.




Tabor, S., and C.C. Richardson. 1985. A bacteriophage T7 RNA polymerase/promoter
system for controlled exclusive expression of specific genes. Proc. Natl. Acad. Sci. USA
82:1074-1078.

Takagi, M., Takada, H., and Imanaka, T. (1990) Nucleotide sequence and cloning in
Bacillus subtilis of the Bacillus stearothermophilus pleiotropic regulatory gene degT. J
Bacteriol 172: 411-418.

Tercero, J.A., Espinosa, J.C., Lacalle, R.A. and Jimenez, A. (1996) The biosynthetic
pathway of the aminonucleoside antibiotic puromycin, as deduced from the molecular
analysis of the pur cluster of Streptomyces alboniger. J Biol Chem 271: 1579-1590.

Thorson, J. S., Lo, S.F., Ploux, O., He, X., and Liu, H.-W. (1994) Studies of the biosynthesis
of 3,6-dideoxyhexoses: molecular cloning and characterization of the asc (ascarylose) region
from Yersinia pseudotuberculosis serogroup VA. J Bacteriol 176: 5483-5493.

West, S.E. and Iglewski, B.H. (1988) Codon usage in Pseudomonas aeruginosa. Nucleic
Acids Res 16: 9323-9335.

West, S.E.H., H.P. Schweizer, C. Dall, A.K. Sample, and L.J. Runyen-Janecky. 1994.
Construction of improved Escherichia-Pseudomonas shuttle vectors derived from pUC18/19
and the sequence of the region required for their replication in Pseudomonas aeruginosa.
Gene 128: 81-86.

Whitfield, C. 1995. Biosynthesis of lipopolysaccharide O-antigens. Trends Microbiol.
3:178-185.

Whitfield, C., and M.A. Valvano. 1993. Biosynthesis and expression of cell-surface
polysaccharides in gram-negative bacteria. Adv. Microb. Physiol. 35:135-246.

Wozniak, D. J. 1994. Integration host factor and sequences downstream of the Pseudomonas
aeruginosa algD transcription start site are required for expression. J. Bacteriol.
176:5068-5076.

Wozniak, D. J., and D. E. Ohman. 1993. Involvement of the alginate algT gene and
integration host factor in the regulation of the Pseudomonas aeruginosa algB gene. J
Bacteriol 175: 4145-4153.

Wood, M.S., Byrne, A., and Lessie, T.G. (1991) IS406 and IS407, two gene-activating
insertion sequences from Pseudomonas cepacia. Gene 105: 101-105.

Xiao, Q. and Moore, C.H. (1993) The primary structure of phosphofructokinase from
Lactococcus lactis. Biochem Biophys Res Commun 194: 65-71.

Yanisch-Perron, C., J. Vieira, and J. Messing. 1985. Improved M13 phage cloning vectors
and host strains: nucleotide sequences of the M13mp18 and pUC19 vectors. Gene 33: 103-119.






Detailed Figure Legends for Figures 22 to 29, 32, 33, and 43 to 47 

Figure 22. Silver-stained SDS-PAGE gel of LPS from PAO1, AK1401, AK1401(pFV100)
and AK1401(pFV.TK8) (Panel A) and Western immunoblots of this LPS reacted with
O5-specific MAb (Panel B). Note that the two strains, AK1401(pFV100) and
AK1401(pFV.TK8), produce levels of B-band LPS similar to the PAO1 wild-type strain.

Figure 23. Restriction maps of the chromosomal inserts from pFV100 and several pFV
subclones. Results of complementation studies of the SR mutants AK1401 and rd7513 with
the pFV subclones are also shown. The three Tn1000 insertions in the 1.5 kb XhoI fragment
of pFV.TK6 that were found to interrupt O-antigen complementation in AK1401 are
indicated. This XhoI fragment was later purified and used as a probe in Southern blot
analysis. Restriction sites: B, BamHI; X, XhoI; S, SpeI; Xb, XbaI; H, HindIII.

Figure 24. Southern analysis of the three rfc chromosomal mutants, OP5.2, OP5.3, and OP5.5,
showing the insertion of an 875 bp GmR cassette into the rfc gene. Restriction maps of the
PAO1 wild-type (panel A) and mutant (panel B) rfc coding regions are shown. Southern
hybridizations of chromosomal DNA from PAO1 (lane 1) and mutants OP5.2, OP5.3 and
OP5.5 (lanes 2-4, respectively) digested with XhoI were performed using an rfc probe (panel
C). This DIG-labelled probe was generated from the 1.5 kb XhoI insert of pFV.TK7 (shown
in panel A). The probe hybridized to a 1.5 kb fragment of PAO1 and a 2.4 kb fragment of
the three rfc mutants. The molecular sizes of the probe-reactive fragments are shown on the
left (in kb).
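
The fragment sizes reported for Figure 24 can be checked with simple arithmetic. The sketch below is illustrative only: the function name is hypothetical, and it assumes that the cassette inserts without deleting chromosomal DNA and carries no XhoI site of its own, neither of which is stated in the legend.

```python
def expected_fragment_kb(wild_type_kb: float, insert_bp: int) -> float:
    """Size of a restriction fragment after a simple insertion (no deletion)."""
    return wild_type_kb + insert_bp / 1000.0


if __name__ == "__main__":
    # 1.5 kb XhoI fragment in PAO1 plus the 875 bp cassette.
    size = expected_fragment_kb(1.5, 875)
    print(f"{size:.3f} kb, reported on the blot to one decimal as {size:.1f} kb")
```

Under these assumptions the predicted mutant fragment agrees with the 2.4 kb band described in the legend.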

Figure 25. Silver-stained SDS-PAGE gel and Western blots of LPS from PAO1,
AK1401 and the three rfc chromosomal mutants, OP5.2, OP5.3, and OP5.5. Panel A: silver-
stained SDS-PAGE gel; Panel B: Western blot reacted with O5-specific MAb MF15-4; Panel
C: Western blot reacted with A-band specific MAb N1F10. Note that the chromosomal rfc
mutants are not able to produce long-chain O-antigen; however, they still express
A-band LPS, like the SR mutant AK1401.

Figure 26. Restriction maps of recombinant plasmids pFV161, pFV401 and pFV402. The
shaded box represents the DIG-labeled probe generated from pFV161. Restriction sites: B,
BamHI; H, HindIII; X, XhoI.

Figure 27. Southern hybridizations of chromosomal DNA from PAO1 (lane 2) and rol
mutants (lanes 3 and 4). Chromosomal DNA in Panel A was digested with PstI and SstI. DNA
in Panel B was digested with HindIII. The samples in Panel A were probed with the GmR
cassette (Schweizer, 1993). The probe used in Panel B is the 2.3 kb HindIII insert from
pFV401. Molecular weight markers, using λ DNA digested with HindIII, are indicated to
the left of each panel.

Figure 28. Characterization of LPS from PAO1 and PAO1 rol chromosomal mutants. The
samples in each lane are as labeled. Panel A is a silver-stained SDS-PAGE gel. Panel B is
the corresponding Western immunoblot reacted with an O5 (B-band)-specific mAb MF15-4.

Figure 29. T7 protein expression of P. aeruginosa O5 Rol. This autoradiogram shows
35S-labeled proteins expressed by pFV401, which contains the rol gene, and corresponding
control plasmid vector pBluescript II SK in E. coli JM109DE3 by use of the T7 expression
system. The arrow indicates the putative Rol protein. Molecular size markers are
indicated to the left of the figure.

Figure 32. Features of the initiation regions. Capital letters for bases indicate one of the
following sites: potential ribosomal binding sites (RBS), the presumed start codon (also in
bold and double underlined), the second codon where it is AAA (the preferred second codon),
and components of the sequences TTAA and AAA from +10 to +13 and from -1 to -3
respectively (Gold and Stormo, 1987). The termination codon of the preceding gene is
indicated by a bar above if it is in the region shown. The reference sequences involved are
also shown above the set of sequences.

Figure 33. NAD-binding domains of PsbA, PsbK and PsbM aligned with those of other
bacterial proteins involved in polysaccharide biosynthesis. The consensus sequence for an
NAD-binding domain (Macpherson et al., 1994) is shown at the bottom of the figure in bold
underline. The first column contains the protein names; the second column indicates the
location of the NAD-binding site within the protein; the third column shows the alignment
of the NAD-binding domains with highly conserved residues indicated in bold type; and
the fourth column gives the reference for the protein shown. Most of the proteins in this
group of sugar biosynthesis enzymes function as dehydrogenases/dehydratases. Note that
PsbM, BplL, and TrsG have two putative NAD-binding domains, instead of one. The
presence of two domains supports the proposal that these large proteins arose from fusion of
two smaller proteins.

Figure 43. Physical map of the 5' end of the wbp cluster. The wzz gene ends approximately
800 bp upstream of wbpA, the first gene of the wbp cluster (8). The probe used to identify a
HindIII fragment containing the intact wzz gene for cloning into pFV401 is shown as a black
bar above the restriction map. The site of insertion of the gentamicin cassette used to create
the wzz knockout mutants is indicated by a black triangle. Key: B, BamHI; H, HindIII; S,
SstI; X, XhoI.



Figure 44. Comparison of hydropathy plots of selected Wzz-like proteins. The
hydropathy plots of selected Wzz-like proteins were calculated using PC/GENE SOAP.
The X axis represents amino acid residues, while the Y axis represents relative
hydropathy. Positive values indicate hydrophobicity; negative values indicate
hydrophilicity. A, P. aeruginosa O5 Wzz, U50397; B, E. coli O111 Wzz, Z17241; C, E. coli
o349, M87049; D, E. coli FepE, P26266; E, Y. enterocolitica OS Wzz, U43708; F, Y.
pseudotuberculosis Wzz; G, V. cholerae O139 OtnB, X90547.
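
A hydropathy profile of the kind plotted in Figure 44 can be sketched as a sliding-window average over per-residue hydropathy values. The legend states that PC/GENE SOAP was used; the example below does not reproduce that program. It substitutes the widely used Kyte-Doolittle scale, and the function name, the window size of 9 and the example peptide are all illustrative assumptions. The sign convention matches the legend: positive values are hydrophobic, negative values hydrophilic.

```python
# Kyte-Doolittle hydropathy values (swapped in here as an assumption;
# the legend's plots were produced with PC/GENE SOAP, not this code).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence: str, window: int = 9) -> list[float]:
    """Mean hydropathy of every full window along a protein sequence.

    Returns one value per window position, so the result has
    len(sequence) - window + 1 entries.
    """
    residues = sequence.upper()
    if window < 1 or window > len(residues):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[aa] for aa in residues]  # KeyError on unknown residue
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


# Hypothetical peptide, for illustration only (not a sequence from this document).
example = "MKLLIVAVLLGGSRKDEEKR"
profile = hydropathy_profile(example, window=9)
for position, value in enumerate(profile, start=1):
    print(position, round(value, 2))
```

Plotting the window start position on the X axis against the returned values on the Y axis gives a curve comparable in form to the panels described above; a hydrophobic membrane-spanning stretch appears as a sustained positive region.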

Figure 45. Expression of P. aeruginosa Wzz in vitro. The 40 kDa Wzz protein (indicated by
black arrowhead) was expressed from the insert of pFV401 in both orientations. A 28 kDa
protein was also expressed in both orientations and may represent either a breakdown
product of the 40 kDa polypeptide, or initiation of translation from a secondary
ribosome-binding site. There are several smaller ORFs encoded on the positive strand of
the 2.3 kb insert of pFV401 which could correspond to the 10 kDa protein.

Figure 46. Analysis of LPS from wzz knockout mutants. LPS from P. aeruginosa serotypes O5
[illegible] corresponding [illegible] mutants examined. Figure 46A: Silver-stained
[illegible]. Figure 46B: Western immunoblot using MAb [illegible], specific for B-band
LPS from the O5 serogroup (serotypes O2, O5, O16, O18, O20). Figure 46C: Western
immunoblot using MAb MF15-4, specific for serotype O5 B-band LPS. The plasmid
[illegible] the O5 [illegible] gene cloned downstream of [illegible].

Figure 47. Ability of P. aeruginosa O5 Wzz to function in E. coli.
Panel A: Silver-stained SDS-PAGE gel of E. coli CLM4 containing the Shigella
[illegible] pSS37, with and without the P. aeruginosa [illegible].
Panel B: Western immunoblot of E. coli HB101 containing the O5 cluster
on pFV100, with and without P. aeruginosa [illegible]. The membrane was
incubated with MAb MF15-4, specific for serotype O5 B-band LPS.

Figure 48. Western immunoblot analysis of lipopolysaccharide (LPS) isolated using the hot
water-phenol method of Westphal and Jann. Lanes O5 are LPS from the parent,
while lanes F1 and F2 are LPS from two mutants containing a gentamicin cassette inserted in
the SstI site within the open reading frame of wbpF. The monoclonal antibodies used are
N1F10, specific for A-band LPS, and 18-19, specific for B-band LPS. Note that a knockout
mutation of wbpF abrogates both A-band and B-band LPS expression.

SEQUENCE LISTING

(1) GENERAL INFORMATION: 

(i) APPLICANTS: 

(A) NAME: UNIVERSITY OF GUELPH
(B) STREET: Office of Vice President of Research, Reynolds Building
(C) CITY: Guelph
(D) STATE: Ontario
(E) COUNTRY: Canada
(F) POSTAL CODE: N1G 2W1
(G) TELEPHONE NO.: (519) 824-4120
(H) TELEFAX NO.: (519) 821-5236

(A) NAME: LAM, Joseph S
(B) STREET: 2 Bridlewood Drive
(C) CITY: Guelph
(E) COUNTRY: Canada
(F) POSTAL CODE: N1G 4A6

(A) NAME: BURROWS, Lori
(B) STREET: 22 Devere Drive
(D) STATE: Ontario
(E) COUNTRY: Canada
(F) POSTAL CODE: N1G 2S9

(A) NAME: CHARTER, Deborah
(B) STREET: [illegible] College Street West
(C) CITY: Guelph
(D) STATE: Ontario
(E) COUNTRY: Canada
(F) POSTAL CODE: N1G 4S7

(A) NAME: de KIEVIT, Teresa
(B) STREET: 2-100 Sunny Lea Crescent
(C) CITY: Guelph
(D) STATE: Ontario
(E) COUNTRY: Canada
(F) POSTAL CODE: N1G 1W6

(iii) NUMBER OF SEQUENCES: 20

(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: BERESKIN & PARR
(B) STREET: 40 King Street West
(C) CITY: Toronto
(D) STATE: Ontario
(E) COUNTRY: Canada
(F) ZIP: M5H 3Y2

(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.30

(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER: PCT
(B) FILING DATE:
(C) CLASSIFICATION:

(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Kurdydyk, Linda M.
(B) REGISTRATION NUMBER: 34,971
(C) REFERENCE/DOCKET NUMBER: 6580-87

(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: (416) 364-7311
(B) TELEFAX: (416) 361-1398



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24417 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Pseudomonas aeruginosa
(B) STRAIN: PAO1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:



CTCGAGATAT TGAGCAGCGC ATACAGAACT TGCGGAGAGA ATGCCAAGGC AGACGTGAAG 60
ATCGTATTGT TCAGCTCAAG GAGGCGTTGA AGGTCGCAGG TGCGCTGAAA TTGGAGGAGC 120
CTCCACTGAT CAGTGGGCAA TCCTCTGAGG AGCTCTCGGC TATCATGAAT GGAAGTCTGA 180
TGTATATGCG TGGCAGTAAG GCGATTATGG CCGAGATTCA GACATTGGAG GCGCGTAGCT 240
CTGATGATCC TTTTATTCCG GCGTTGCGTA CTCTTCAGGA GCAGCAGTTA TTGCTGAGTA 300
GCTTGCGTGT TAATTCGGAG CGGGTTTCTG TTTTTCGACA AGACGGTCCG ATAGAAACGC 360
CGGACTCACC AGTTCGTCCA AGGAGAGCGA TGATTTTGAT TTTTGGGTTG ATAATTGGTG 420
GTGTGCTTGG TGGTTTTCTG GCGTTGTGCC GGATTTTTTT GAAGAAGTAT GCTCGTTAGG 480
AAAGAGCTAG TTATTGAAGT GGTGATGCGT TGCACGTACT TTGGTCGAGT AATTTTGTGG 540
AGTAGGTTTT CGTTGGGTGG CTCGATTGCT GAGGGGTGAG AACGTTTCCA TGCGGTGTTT 600
CCTCAGCTCT GTCTCCTGTG CCTTGGCTCC TTGAACGCAG AGGTTAACAG TTGAGCTGTG 660
GTTGTGGGTA TGTGACGTCT GTTGCGGTGG TGTCTGGTTC CTGGTGTCGG GTGTGCGAGA 720
AGATGCCAAG TTGCCTGGCA GGTCGTTACG TGTCGTAGCC GTATTCGAAG CTCGGCAATC 780
GCGGGGTGAT TTACAGGACT GTGCTTAATA CGGCGCAGGC TTGGTCAGGG TCGAGTCGGG 840
TCTTCGGGTG TCAACTGGAT CGTGCGAAAA CCGGTTTCGT GGATGCTGAT AAGCTCGGCT 900
TGACTGGCAG TCCAGGGCGG TTACCAGGTC TGTGGAGGCG CAAAATGTAT AGGAGCCTGC 960
GTGAGCTGGG CAGGCTGAAG GCCTGCTCGA AAGCGAGTTA GCATTGTGGT CCGGAAGGGC 1020
ATGGGTGGAC CAGAGTGCCG TTCTGCACGG CAAAAGCCAA CTTGCTCGGA GGTTCCCTAG 1080
CGCCTATGAT TACGACGCCC TTCATTTTTG GCCATTGCCG CCAGGTGCTG TGGAAAGCGA 1140
CAGTATCCCT TCTTTATCGA TCTTGTGAAG ATGTCGAGAG TGGTCGCAGA AAGGATTCAC 1200
TCGACTGACG AATGAATCGT GGAAGATTTA AGTTCCCGTT GTGCGGTCGC AGGCGCGGGC 1260
AGGTAAAATT GAGGTGAGTT GGAAAATGAT AGATGTTAAC ACAGTGGTAG AGAAGTTCAA 1320
AAGCCGACAG GCCTTGATTG GTATCGTGGG TCTGGGTTAT GTCGGTTTAC CACTGATGCT 1380
GCGATACAAC GCCATTGGTT TCGATGTCTT GGGTATCGAT ATCGATGATG TCAAGGTTGA 1440
CAAGCTTAAT GCCGGGCAGT GCTATATCGA ACATATTCCG CAAGCCAAAA TTGCTAAGGC 1500
CCGTGCAAGC GGTTTCGAGG CTACGACCGA TTTCAGCCGT GTCAGTGAAT GTGATGCCCT 1560
GATCCTTTGT GTGCCGACGC CGCTGAACAA GTATCGCGAG CCGGATATGA GCTTTGTCAT 1620
CAATACCACC GACGCACTAA AACCGTATCT GCGCGTAGGG CAGGTGGTTT CGCTGGAAAG 1680
TACCACCTAT CCGGGAACTA CCGAGGAAGA GTTGTTGCCA CGCGTGCAGG AGGGTGGCCT 1740
CGTGGTTGGC CGGGACATCT ACCTGGTCTA TTCTCCGGAG CGTGAAGATC CGGGCAACCC 1800
GAACTTCGAG ACTCGTACCA TTCCGAAAGT GATCGGTGGT CACACTCCTC AGTGTCTGGA 1860
AGTCGGCATT GCCCTGTATG AACAGGCCAT CGACCGGGTC GTGCCGGTCA GTTCCACCAA 1920
GGCCGCCGAG ATGACCAAGC TGTTGGAGAA CATTCATCGC GCGGTCAATA TCGGTCTGGT 1980
CAACGAAATG AAGATCGTTG CTGATCGCAT GGGTATCGAC ATCTTTGAAG TGGTTGATGC 2040
TGCGGCGACC AAGCCGTTCG GTTTCACTCC TTACTACCCA GGGCCGGGAC TGGGCGGGCA 2100
CTGTATCCCG ATCGATCCCT TCTACCTGAC TTGGAAGGCT CGCGAATACG GACTGCATAC 2160
CCGCTTCATC GAACTGTCTG GTGAGGTCAA CCAGGCCATG CCGGAATACG TACTGGGCAA 2220
ACTCATGGAT GGCCTGAACG AGGCAGGCAG GGCCCTCAAG GGCAGTCGTG TACTGGTATT 2280
GGGTATCGCT TATAAGAAGA ATGTCGACGA CATGCGCGAG TCGCCATCCG TGGAAATCAT 2340
GGAGCTGATC GAAGCCAAGG GTGGGATGGT CGCCTATAGC GATCCGCATG TGCCGGTGTT 2400
CCCGAAGATG CGTGAACACC ACTTCGAACT GAGCAGTGAG CCGCTGACTG CCGAAAACCT 2460
GGCTAGGTTC GACGCTGTAG TGCTTGCGAC CGACCATGAC AAGTTTGACT ATGAGCTGAT 2520
CAAGGCCGAA GCCAAGCTAG TTGTTGACAG CCGTGGCAAG TACCGCTCCC CGGCGGCACA 2580
CATCATCAAG GCTTGATCAC CCATCCCAGC ATGTCCATCC GCTCGTGCCA GAAGGCCGGG 2640
CGGATCCGCT CATTTCCATA GGACGAACCA TGAAAAATTT CGCTCTCATC GGTGCTGCCG 2700
GCTACATCGC TCCTCGCCAT ATGCGCGCCA TCAAAGACAC CGGTAACTGC CTGGTTTCGG 2760
CCTATGACAT CAATGACTCG GTCGGTATTA TTGATAGCAT CTCTCCCCAG AGCGAGTTTT 2820
TTACCGAGTT CGAGTTCTTT CTTGATCATG CGAGCAACCT CAAGCGCGAC TCTGCTACCG 2880
CGCTGGACTA CGTATCGATC TGCTCGCCCA ATTACCTGCA CTACCCGCAT ATCGCTGCAG 2940
GTCTGCGCTT GGGTTGCGAC GTAATCTGCG AAAAGCCGCT TGTTCCAACC CCAGAGATGC 3000
TCGATCAGTT GGCTGTTATC GAGCGCGAAA CCGATAAGCG CCTCTACAAC ATTCTGCAAC 3060
TGCGTCATCA CCAGGCGATC ATCGCATTGA AGGACAAGGT CGCCCGCGAA AAAAGTCCGC 3120
ATAAGTACGA GGTCGATCTG ACTTACATTA CTTCCCGCGG CAACTGGTAT CTGAAAAGCT 3180
GGAAGGGAGA TCCACGTAAG TCGTTCGGCG TGGCTACCAA CATCGGTGTG CACTTCTACG 3240
ACATGCTGCA CTTCATCTTT GGCAAGCTGC AGCGTAATGT TGTGCACTTC ACTTCCGAGT 3300
ACAAGACAGC TGGTTATCTG GAGTACGAGC AGGCCCGTGT GCGTTGGTTT CTGTCCGTGG 3360
ATGCTAACGA CCTGCCGGAG TCGGTCAAGG GCAAAAAGCC GACCTATCGT TCGATTACCG 3420
TCAACGGTGA GGAAATGGAG TTCTCTGAAG GCTTTACCGA TCTACATACA ACCAGCTACG 3480
AAGAAATTCT CGCTGGTCGT GGTTATGGCA TCGATGACGC TCGTCATTGT GTGGAAACTG 3540
TCAATACCAT TCGCAGCGCC GTCATCGTAC CGGCCTCTGA TAACGAAGGG CATCCGTTCG 3600
TCGCGGCGCT TGCGCGTTGA GGTAGAAAAG GAGTGGCCGT CCTCGGTCAC CTGTTTACAG 3660
CAGGTTTCCG CAGGATCATT CATCAGCATG TCATCTAGTA GCTCTAAATT GCTGAACGGT 3720
ATGGTCGCGG TAAGTTCAGG CAGAAACATT CGGCTGGATG TCCAGGGGCT GCGGGCTGTT 3780
GCAGTTCTGG CTGTGCTAGC TTACCACGCC AACAGTGCCT GGCTCAGGGC TGGGTTTGTC 3840
GGCGTTGACG TGTTCTTCGT CATTTCCGGG TTTATCATTA CCGCCTTACT GGTCGAGCGC 3900
GGTGTAAAAG TTGATCTGGT AGAGTTTTAC GCGGGCCGTA TCAAACGTAT TTTTCCAGCC 3960
TATTTCGTCA TGTTGGCGAT TGTCTGCATT GTCTCGACAA TTCTGTTTCT GCCTGATGAC 4020
TATGTTTTTT TTGAAAAAAG TCTACAGTCA TCTGTATTTT TTTCCAGTAA TCACTATTTC 4080
GCTAATTTTG GTAGTTACTT TGCTCCGAGA GCTGAAGAGC TGCCGCTGCT GCATACTTGT 4140
TCAATAGCCA ACGAGATGCA GTTTTATCTG TTCTACCCTG TACTGTTCAT GTGCCTGCCA 4200
TGTCGATGGC GCTTGCCGGT GTTCATCCTA TTAGCTATTT TGCTGTTCAT TTGGAGTGGC 4260
TATTGCGTAT TCAGCGGCAG CCAAGATGCT CAGTACTTCG CCTTGCTAGC TCGTGTACCT 4320
GAGTTCATGT CGGGAGCTGT TGTCGCATTA TCATTACGTG ATCGTGAGCT ACCCGCCAGG 4380
CTTGCGATAC TTGCGGGGTT ATTGGGGGCG GCGTTGCTGG TCTGCTCCTT CATTATCATC 4440
GACAAGCAGC ACTTTCCCGG ATTCTGGTCG CTCCTGCCAT GCCTGGGAGC CGCTCTGCTC 4500
ATTGCTGCCC GACGTGGCCC TGCCAGCCTG CTGCTGGCCA GCAGGCCCAT GGTCTGGATA 4560
GGTGGTATCT CCTATTCGTT GTATCTGTGG CACTGGCCAA TTCTGGCATT CATCCGTTAC 4620
TACACCGGCC AATACGAATT GAGCTTCGTG GCGCTGTTGG CATTTCTCAC AGGTTCGTTC 4680
CTGCTGGCCT GGTTCTCATA CCGCTACATC GAGACACCTG CCAGAAAGGC TGTGGGTCTG 4740
CGCCAGCAGG CGCTGAAGTG GATGTTGGCC GCCAGTGTGG TAGCTATAGT GGTTACGGGG 4800
GGGGCGCAGT TCAATGTGTT GGTTGTGGCG CCGGCGCCAA TTCAGTTGAC GCGCTACGCT 4860
GTACCAGAGT CGATCTGCCA TGGTGTTCAG GTAGGGGAGT GCAAGCGAGG CAGCGTCAAT 4920
GCCGTACCCC GTGTGCTGGT GATCGGTGAT AGCCATGCTG CGCAGCTTAA CTACTTCTTC 4980
GACGTGGTTG GCAACGAGTC AGCTGTGGCT TACCGAGTAC TCACCGGAAG CAGTTGTGTG 5040
CCAATACCTG CTTTCGATCT TGAACGTTTG CCCCGTTGGG CGCGGAAACC CTGCCAAGCG 5100
CAGATTGATG CAGTTGCCCA ATCAATGTTG AACTTTGACA AGATCATTGT GGCGGGCATG 5160
-SC^ATC AGATGCAGAG TCCGGCATTT GCCCAGGCTA CC^TCGAT
ACCACCTATG CCGGCAAGCA GGTCGCTCTA CTCGGGCAGA TACCGA1GTT CGAATCAAAC 
GTGCAGCGTG TGCGTCGTTT CAGGGAGCTG GGTTTGTCAG C^CGCTTOT TAGC'CCAGC 
TGGCAAGGTG CGAACCAGCT GTTGCGTGCT CTAGCCGAGG GTATTCCAAA CGTACGGTTC 

atggatt™ CTO cagcgc c^cttcgcc ga^ctt a^aggacgg agaccttatt 

TACCAGGATA GCCATCACCT TAACGAGGTG GGGGCTCGCC GOTA^A 
CGTCAA^C AGCGGCTGTT ^AACAACCA CAATCGAGTG TGAGTCTCAA GCCATGAGTT 

r™* i ~ c ° Ac<ra «««~ 

TTTGGCACTT CGTGCACATC TGTGCAGGTG CCCGGATTGG CGCAGGGGTT TCGTTGGGTC 
AGAACGTATT CGTCGGCAAG AAGGTCGTTA TTGGTGATCG CTGCAAGATC CAGAACAACG 
TGTCGGTATA TGACAATGTC ACTCTGGAAG AGGGCGTGTT C^CGGGCCG AGCATCGTA* r 

ZZZ rac " cccc CGCTCCTTCA TCG4ococ " ««"~ 

TCCT AAAAAA AGGTGCCACG CTTGS^CCA ACTGCACTAT CGTCTGTGGC GTOACTA^ 
CTGAATATGC CTTCCTGGGT GCGGGTGCGG T^AACAA GAATGTTCCA TG^CC 
™TGGTAGG CG TC CCCGCT CGACAGATTG GTTGGATAGC GAATTCGGTG AGCAGCTGCA 
GCTGAACGAG CAGGGCGAAG CTGTCTGCTC ACACTCCGGT GCGCGCTATG TACTCAATGG 
AAAGATCCTG AGCAAGGTGG ACGTG^CC A^A^AA, ^ 
CAAGCGCGTA TCAAGGACAA GA^GATGCC GGTATCCAGC GCGTGCTGAG ACACGGGCAG 
TACAT^ GCCCGGAAGT CACTGAGCTT GAGGATCGCC TCGCCGATTT CGTCCGCGC- 
^ACTGCA TCAG^GC CAACGGTACT GACGC^AC AGATTGTGCA GA^CC^ 

" ™- CCTGGTTTTA crr^ GACAGCGGAG 

-CGTCGGGC ^GGGAGc CMSCC<JCTT ^ 

~GC AGTTGCTGGA GGC^GATC ACACCGCGTA CGAAGGCTAT ^ 
TCGCTGTATG GCCAG^ AGACTTCGAT GCAATCAACG CCA^CC. CAAATATGGT 
ATCCCTGTCA TTGAGGATGC TGCACAGAGC TTCGGTGCTT CGTACAAGGG .AAGCG^ 
TGTAATC^ g T AC CCTTC c CTGCACCAGC TTC^CCCGA GCAAACCGTT GGG^c™ 
OGGGATGGrG GAGCGA^ CACTAACGAC GATGAACTGG C.a™ ^ 
GCCCGGCATG GTCAGGACCG CCGCTATCAT CACA^ ,GGGGG TC AA TAG.CGC^ 
GACACATTGC AGGC^CGAT TCTTCTACCG AAGCTTGAAA TTTTCGAGGA GGAGATTGCG 
TTGCGGCAGA AGGTAGCCGC GGAGTATGAC CTATCACTGA AACAGGTCGG TATCGGCACG 

ccgt™™ gaagtggata ACATCAG TCT „atgcccag ta.acgg^c gta^aa 

™~ r AGGm ctttcmagc cc ™ — C 

TATTCCGCTT AATAAGCAGC CTGC^c GGATGAGAAA GCGAAACTAC CAGTOGG^
CAAGGCTGCT ACTCAAGTAA TGAGCCTACC CATGCATCCC TATCTGGATA CGGCATCCAT 7200
CAAAATCATC TGTGCTGCGT TGACGAATTG ACGGATGTAT ATACTTGCTC GAGTCGACAG 7260
GTCTATTCTG CTGAACACAG TGTTACTGTT TGCTTTCTTT TCAGCGACAG TGTGGGTGAA 7320
TAATAATTAT ATCTATCATC TCTATGATTA TATGGGGTCT GCGAAAAAAA CTGTCGACTT 7380
CGGCTTGTAT CCGTACTTGA TGGTCTTGGC GCTCATCTGT GCCCTGTTGT GTGGAGGGGC 7440
AATTCGCAGG CCAGGTGATC TGTTAGTTAC ATTATTAGTT GTAATACTTG TTCCTCATTC 7500
ATTGGTTCTT AATGGAGCTA ATCAATATTC TCCGGATGCG CAACCATGGG CTGGCGTGCC 7560
TCTGGCAATT GCTTTTGGTA TTTTGATCAT CGGCATTGTC AATAAGATAA GATTCCATCC 7620
GCTAGGTGCA TTGCAGCGAG AAAACCAAGG AAGGCGAATG TTAGTGCTAC TGTCAGTACT 7680
CAACATAGTA GTGCTTGTGT TTATTTTCTT TAAAAGCGCT GGTTATTTTT CCTTTGACTT 7740
TGCTGGGCAG TATGCTCGCC GTGCACTTGC TCGTGAGGTT TTTGCTGCGG GTTCTGCAAA 7800
CGGCTACTTG TCGTCAATCG GTACCCAGGC ATTCTTTCCT GTGTTGTTTG CCTGGGGGGT 7860
CTACAGACGA CAATGGTTCT ACTTGGTCCT GGGTATTGTC AATGCACTAG TGCTGTGGGG 7920
AGCGTTTGGA CAGAAGTATC CTTTTGTCGT GTTGTTTCTA ATTTATGGCC TGATGGTTTA 7980
TTTTCGACGA TTCGGTCAGG TCAGAGTGTC TTGGGTTGTC TGCGCACTAT TGATGCTTTT 8040
GCTTTTAGGG GCGTTGGAAC ATGAGGTGTT TGGCTATTCA TTCTTGAATG ATTATTTTCT 8100
ACGTCGTGCT TTTATTGTGC CTTCCACCCT GTTGGGGGCA GTTGATCAGT TTGTGTCTCA 8160
GTTCGGATCC AATTATTACA GGGATACCCT GTTGGGCGCG CTCTTGGGTC AGGGTAGGAC 8220
TGAGCCGTTG AGCTTTCGTC TGGGGACGGA AATTTTCAAT AATCCCGATA TGAATGCGAA 8280
TGTAAACTTC TTCGCGATAG CCTATATGCA GTTGGGTTAT GTGGGGGTTA TGGCTGAGTC 8340
GATGTTGGTG GGCGGTAGTG TCGTTCTCAT GAATTTCTTA TTTTCGAGGT ATGGTGCATT 8400
CATGGCCATT CCGGTTGCTT TGTTATTTAC TACAAAGATT CTTGAGCAGC CCCTGCTAAC 8460
TGTAATGCTT GGCTCTGGTG TTTTCTTGAT ACTGCTTTTC CTTGCGCTAA TTTCTTTTCC 8520
ACTCAAGATG TCTTTAGGAA AAACTCTATG AGTGCGGCTT TTATCAACCG TGTCGCACGA 8580
GTATTAGTAG GCACCTTGGG AGCACAGCTC ATAACGATTG GTGTCACTCT GCTACTGGTT 8640
CGTCTGTATT CTCCTGCTGA AATGGGCGCT TTCAGTGTTT GGCTATCGTT CGCTACGATT 8700
TTTGCAGTTG TAGTTACTGG GCGCTATGAG TTGGCTATTT TTTCGACTCG AGAAGAGGGC 8760
GAACTCCAGG CAATCGTCAA GCTGATACTT CAGTTGACAC TATTGATTTT CGTTGCCGTG 8820
GCGATTGCTG TTGTTATAGG TAGACATCTG ATTGAGTCGA TGCCAGTTGT GATCGGTGAA 8880
TACTGGTTCG CATTGGCGGT GGCTTCGCTG GGGTTGGGGA TAAATAAGCT AGTCTTGTCG 8940
TTACTTACAT TTCAACAATC TTTTAATCGG TTGGGAGTTG CTCGTGTAAG CCTGGCTGCA 9000
TGTATTGCCG TTGCACAAGT TTCAGCTGCA TATTTACTGG AGGGCGTATC AGGGCTGATC 9060
TATGGCCAGC TGTTTGGTGT CGTCGTAGCC ACGGCGCTTG CGGCCCTTTG GGTAGGAAAG 9120
TCGCTGATTT TAAATTGTAT CGAGACACCG TGGCGTATGG TACGACAAGT AGCGGTACAG
TACATCAATT TCCCGAAGTT TTCTCTGCCT GCGGATCTGG TCAACACGGT TGCCAGTCAC 
GTGCCTGTGA TTTTATTGGC GGCAAAGTTT GGTGGAGACA GTGCAGGCTG GTTTGCCCTG 
ACTCTGAAGA TAATGGGAGC TCCCATTTCC TTGTTGGCTG CTTCGGTGCT CGATGTGTTC 
AAAGAACAAG CCGCTCGTGA CTACCGAGAG TTTGGTAATT GCCGAGGTAT CTTCCTCAAG 
ACTTTCAGGT TGCTTGCCGT CCTCGCGCTA CCTCCTTTTA TTATATTTGG TTCATTGGCG 
AGTGGGCCTT TGGGTTAGTC TTTGGCGAAG CGTGGGCTGA GTCGGGGCGT TA TC CTGTA~ 
TGATGGTTCC GTTGTTTTAT ATGCGTTTCG TGGTGAGTCC GCTCAGCTAT ACAATCTATA 
TTGCCCAGCG GCAGAGTATG GATTTGTTGT GGCAGCTAGC CTTGT^CTC CTGACGTTA 
TCTGTTTTAC CTTGCCTGAC TCTGTCGACT CGGTGTTGTG GTTTTACTCC ATAGCATATG
CTGTTATGTA TTTTGTCTAT TTCTGGATGT CCTTCCAGTG TGCCAAGGGA GATGCCAAGT 
GATCGTTGTT ATTGATTACG GTGTAGGTAA CATTGCTTCA GTCTTGAACA TGCTGAAGCG 
AGTTGGTGCC AAAGCCAAGG CATCCGATAG CCGAGAGGAT ATCGAGCAGG CGGAGAAACT 
GATTTTGCCT GGTGTCGGTG CTTTTGACGC CGGAATGCAA ACACTACGCA AGAGTGGGCT 
GGTGGATGTA CTGACAGAGC AGGTCATGAT CAAACGAAAG CCGGTCATGG GGGTGTGTCT 
CGGGAGTCAA GATGCTGGGG CTGCGATCTG AGGAGGGAGC GGAACCGGGG CTTGGATGGA 
TCGATATGGA TAGCGTCCGT TTCGAAAGGC GTGACGACCG AAAGGTTCCA CATATGGGCT
GGAATCAAGT GTCCCCGCAA TTGGAGCATC CTATACTTAG CGGTATAAAC GAGCAAAGCC 
GATTCTATTT TGTTCATAGT TATTATATGG TTCCGAAAGA CCCAGACGAT ATCCTGTTGA 
G TTGT AATTA TGGACAAAAA TTCACTGCGG CGGTGGCTCG GGATAATGTT TTCGGAT rB TC 
AGTTTCATCC TGAGAAGAGT CATAAATTCG GTATGCAGTT ATTCAAAAAC TTCGTGGAGC 
TTGTCTGATG GTCCGGAGGC GCGTTATCCC ATGCTTGCTG CTCAAGGATC GCGGTCTAGT 
GAAAACCGTG AAGTTCAAGG AGCCCAAGTA CGTTGGAGAC CCGATCAACG CAATACGCAT
CTTCAATGAG AAAGAAGTCG ACGAACTGAT TTTGCTGGAT ATAGATGCTT CCAGGCTCAA 
TCAAGAGCCT AACTATGAGT TGATCGCGGA AGTGGCTGGT GAGTGTTTTA TGCCTATTTG 
CTATGGGGGC GGTATCAAGA CATTGGAGCA TGCGGAAAAA ATCTTTTCCC TAGG^CGA 
AAAAGTTTCG ATAAATACCG CCGCTCWAT GGATCTTTCG TTGATTCGAA GAATTGCCGA 
TAAGTTTGGT TCGCAAAGCG TAGTTGGCTC TATCGACTGC CGCAAGGGTT TCTGGGGAGG 
ACACTCCGTG TTCTCAGAGA ATGGGACGCG CGACATGAAA CGCTCCCCAT TGGAGTGGGC 
GCAAGCGCTC GAAGAGGCTG GAGTGGGTGA GATTTTTCTA AATTCTATTG ATCGAGATGG 
AGTGCAGAAA GGCTTCGACA ACGCTCTAGT GGAAAATATC GCTTCTAACG TCCATG^GCC 
AGTGATCGCC TGTGGTGGAG CTGGCTCCAT CGCTCACCTC ATCGATCTTT TTGAGCGTAC 
GTGTGTGTCG GCAGTAGCGG CGGGAAGCCT ATTCGTTTTC CATGGCAAGC ATCGTGCGCT
ACTGATTAGT TATCCGGATG TCAACAAGCT CGACGTCGGT TAGAGTGAGC TGAGTTATTT 11160
ATGGCAAGGA CGCTTGTTGG CAACGCTATA TGCGCTTCAA GATTGTCGAA CTAAATTTGA 11220
GTTTGTCAGT GGGGCGTTCC ATTAGGCAGG CCGAGGTGAG TGCTTCGGGA GGTTGTTGTG 11280
ATGAAGATCT GTTCGCGCTG TGTTATGGAT ACATCTGACG CTGAAATCGT ATTTGATGAG 11340
GCGGGAGTCT GTAATCACTG CCATAAATTT GACAATGTTC AGTCCCGGCA GCTGTTTTCC 11400
GATGCTAGTG GTGAGCAGCG CCTTCAAAAG ATAATTGGGC AGATCAAGAA GGACGGTTCA 11460
GGTAAGGATT ATGACTGCAT CATTGGCCTT AGTGGCGGCG TAGATAGTTC CTATCTTGCT 11520
GTAAAGGTCA AGGATCTTGG CTTGCGCCCA CTGGTTGTGC ATGTGGACGC CGGCTGGAAT 11580
AGCGAACTTG CAGTCAGTAA TATTGAAAAG ATTGTAAAAT ATTGCGGTTT TGATTTACAT 11640
ACTCATGTAA TAAACTGGGA GGAAATTCGT GATCTTCAGT TGGCTTATAT GAAAGCTGCT 11700
GTCGCCAATC AGGATGTGCC TCAAGATCAT GCCTTCTTCG CTAGTATGTA TCACTTTGCT 11760
GTGAAGAATA ATATTAAGTA CATTCTGAGT GGTGGTAATT TGGCCACTGA GGCAGTATTC 11820
CCAGATACAT GGCACGGCAG CGCTATGGAT GCAATAAACC TAAAGGCTAT TCACAAAAAA 11880
TATGGTGAGC GTCCGCTAAG GGACTACAAG ACTATTAGTT TTCTTGAGTA CTATTTCTGG 11940
TATCCCTTTG TCAAAGGAAT GAGAACGGTC CGTCCGTTGA ATTTCATGGC CTATGATAAG 12000
GCCAAGGCTG AAACCTTCCT TCAAGAAACG ATAGGCTATC GTTCTTACGC GCGAAAGCAT 12060
GGAGAGTCGA TTTTCACCAA GCTTTTCCAG AACTACTATC TACCGACCAA GTTTGGCTAT 12120
GATAAACGCA AACTGCACTA CTCCAGCATG ATTTTGTCTG GGCAAATGAC GCGTGACGAA 12180
GCTCAGGCTA AACTGGCTGA GCCGCTATAT GATGCAGATG AACTGCAGTT TGATATCGAA 12240
TATTTCTGCA AGAAGATGCG AATCACCCAG GCTCAATTTG AAGAGTTGAT GAATGCACCT 12300
GTTCATGACT ATTCGGAGTT TGCCAACTGG GATTCTCGAC AGAGGATTGC GAAAAAAGTT 12360
CAAATGATTG TCCAGCGTGC GCTGGGTCGT CGCATCAATG TCTACTCGTG ATGACCGGGG 12420
CCGCTCATGA CTAAAGTTGC TCATTTGACA TCGGTTCACT CGCGTTATGA TATTCGTATA 12480
TTTCGAAAGC AGTGTAGAAC ACTCTCTCAA TACGGATACG ATGTGTATCT GGTTGTCGCA 12540
GATGGTAAGG GTGATGAAGT CAAGGATGGT GTAAGGATTG TTGATGTCGG AGTACTCTCA 12600
GGTCGCTTGA ATCGTATTCT AAAAACCACC CGAAAAATTT ATGAACAGGC TTTGGCGCTT 12660
GGGGCTGATG TCTATCATTT TCATGATCCC GAACTGATAC CTGTTGGTCT TCGACTGAAA 12720
AAGCAAGGTA AGCAGGTTAT CTTCGACTCC CATGAGGATG TGCCGAAGCA ACTGCTGAGT 12780
AAACCTTACA TGCGACCGTT TTTACGCCGT GTAGTGGCTG TGTTATTTTC CTGCTATGAG 12840
AAATATGCAT GCCCTAAGCT GGATGCAGTC CTTACGGCAA CGCCGCATAT TCGTGAAAAA 12900
TTTAAAAATA TTAATGGGAA TGTTCTAGAT ATTAATAACT TTCCCATGTT GGGTGAGTTG 12960
GATGCGATGG TTCCTTGGGC AAGCAAGAAA ACTGAAGTCT GCTACGTCGG TGGTATCACT 13020
TCCATTCGTG GTGTTCGTGA AGTCGTTAAG AGTCTTGAGT GCTTGAAGTC CTCGGCGCGC 13080
TTGAATTTAG TGGGAAAGTT TTCAGAGCCA GAGATAGAAA AAGAAGTCAG AGCGCTCAAG 13140
GGATGGAACT CCGTTAACGA ACATGGTCAG CTTGATCGAG AAGATGTTCG TCGTGTACTC 13200
GGTGACTCTG TTGCCGGGTT GGTGACATTT CTCCCAATGC CTAATCATGT TGATGCACAA 13260
CCTAATAAGA TGTTCGAGTA TATGTCGTCG GGAATCCCTG TGATCGCTTC CAATTTTCCT 13320
CTCTGGCGGG AAATTGTTGA AGGTAGCAAT TGTGGTATAT GCGTAGATCC TCTAAGTCCT 13380
GCTGCCATTG CTGAAGCGAT CGACTATCTG GTAAGTAATC CGTGTGAGGC GGCAGCGCTG 13440
GGACGTAATG GCCAGCGGGC AGTGAACGAA CGTTATAACT GGGATTTGGA AGGGCGCAAA 13500
CTAGCGCGGT TCTATTCCGA TCTACTGAGT AAGCGAGATT CCATATGAAA ATTCTGACCA 13560
TCATTGGTGC GCGTCCGCAG TTTATTAAAG CGAGTGTGGT TTCAAAGGCT ATCATTGAGC 13620
AGCAGACCCT TTCGGAAATC ATCGTTCATA CTGGTCAGCA TTTTGATGCC AATATGTCTG 13680
AAATATTTTT CGAACAGCTG GGTATTCCAA AGCCGGATTA CCAGTTGGAT ATCCATGGTG 13740
GTACTCACGG CCAAATGACC GGGCGTATGC TAATGGAGAT CGAGGATGTA ATTCTCAAGG 13800
AGAAACCTCA TCGCGTATTG GTATACGGCG ATACCAACTC TACCTTGGCT GGAGCGTTGG 13860
CTGCCTCCAA GCTGCATGTT CCTATCGCAC ACATCGAAGC CGGCCTGCGA AGTTTCAATA 13920
TGCGGATGCC GGAGGAAATT AACCGTATTC TTACTGATCA GGTTAGTGAT ATTCTGTTTT 13980
GCCCTACTCG AGTTGCAATT GATAATCTCA AGAATGAAGG TTTCGAAAGA AAGGCTGCGA 14040
AGATAGTCAA CGTGGGTGAT GTGATGCAGG ATAGCGCTCT ATTCTTTGCG CAGCGTGCAA 14100
CCTCGCCAAT TGGACTTGCG TCACAAGATG GGTTTATTCT CGCGACCCTG CATCGTGCCG 14160
AGAACACCGA CGATCCAGTT CGCCTGACTT CGATAGTCGA GGCTCTGAAT GAAATCCAGA 14220
TTAATGTTGC ACCTGTGGTG CTACCCCTGC ATCCACGTAC CCGCGGTGTC ATCGAGCGCC 14280
TAGGGCTCAA GCTGGAAGTG CAGGTTATCG ATCCTGTCGG ATATCTGGAA ATGATCTGGC 14340
TGTTGCAACG CTCTGGCCTG GTGCTCACGG ACAGCGGCGG TGTTCAGAAA GAAGCATTCT 14400
TCTTCGGCAA GCCCTGCGTG ACCATGCGTG ACCAGACCGA ATGGGTGGAG CTAGTGACCT 14460
GTGGAGCCAA CGTTCTTGTG GGAGCGGCCC GCGACATGAT TGTCGAATCT GCACGGACTA 14520
GCCTGGGAAA GACCATTCAA GACGATGGTC AGCTTTACGG AGGCGGTCAA GCCTCTCTCG 14580
GATTGCTGAA TATCTTGCCA AGCTGTGATG CTTTGCGTGT CGAGTTTAAA TAAAGGATTT 14640
ATTTAGTTCC ATGAACGTCT GGTATGTGCA TCCCTATGCT GGCGGCCCCG GAGTTGGTCG 14700
TTATTGGCGG CCTTATTATT TCTCCAAGTT TTGGAATCAG GCTGGGCATC GGTCGGTCAT 14760
AATCTCGGCA GGCTATCACC ATCTGCTGGA ACCGGATGAA AAGCGTTCGG GCGTCACCTG 14820
TGTAAATGGA GCCGAATACG CATATGTACC TACTTTGCGC TATTTGGGCA ATGGCGTGGG 14880
CAGAATGCTA TCGATGCTCA TATTTACCAT GATGTTGCTG CCATTCTGCC TGATCTTGGC 14940
CCTGAAGCGT GGAACGCCGG ATGCGATTAT CTACTCATCG CCTCACCCGT TTGGCGTCGT 15000
TAGCTGTTGG CTGGCTGCTC GCCTGCTAGG TGCGAAATTT GTATTTGAGG TGCGCGATAT 15060
CTGGCCTTTG AGTCTGGTCG AACTGGGAGG CTTGAAAGCT GACAATCCCC TGGTGCGTGT 15120
TACCGGTTGG ATCGAAAGAT TCTCCTATGC GCGAGCTGAT AAGATCATCA GTCTGCTGCC 15180
ATGTGCGGAG CCGCACATGG CCGACAAAGG ACTTCCCGCT GGAAAGTTCC TGTGGGTTCC 15240
GAATGGCGTT GACAGCAGCG ATATCTCTCC TGATAGCGCT GTGAGTTCAA GTGATTTGGT 15300
CCGGCATGTA CAAGTTCTCA AGGAGCAGGG TGTTTTCGTT GTGATCTATG CTGGAGCGCA 15360
CGGCGAACCC AATGCTCTGG AGGGATTGGT TCGCTCTGCC GGACTGCTGC GCGAGCGTGG 15420
TGCAAGTATC AGAATCATTC TGGTGGGCAA GGGAGAGTGC AAAGAGCAAC TCAAGGCGAT 15480
TGCCGCACAG GATGCCAGCG GGCTAGTGGA GTTTTTCGAT CAGCAGCCCA AAGAGACTAT 15540
CATGGCTGTC CTGAAGCTGG CGTCGGCGGG CTACATCTCG CTCAAGTCAG AACCGATCTT 15600
CCGCTTTGGC GTGAGCCCCA ACAAGCTATG GGATTACATG CTGGTTGGGT TGCCAGTCAT 15660
TTTCGCCTGC AAGGCAGGGA ACGACCCGGT TAGTGACTAC GATTGCGGTG TATCTGCCGA 15720
CCCAGATGCC CCTGAGGATA TTACTGCAGC CATCTTCCGT CTGTTGCTGC TGAGCGAAGA 15780
CGAGCGTCGC ACAATGGGGC AAAGAGGGCG TGATGCGGTC CTGGAGCATT ATACCTACGA 15840
GAGTCTGGCT CTTCAGGTGT TGAACGCCCT TGCTGATGGG CGCGCAGCAT GAAAGCTGTC 15900
ATGGTGACCG GTGCATCAGG ATTCGTCGGA TCGGCCTTGT GCTGTGAGCT TGCTCGGACA 15960
GGGTATGCGG TGATTGCGGT GGTACGGCGG GTTGTTGAAA GAATACCTTC TGTGACGTAC 16020
ATCGAAGCTG ATCTGACCGA TCCAGCCACG TTTGCCGGCG AGTTCCCGAC GGTGGATTGC 16080
ATTATTCATC TCGCTGGACG TGCCCATATA CTCACTGACA AGGTTGCAGA CCCGCTCGCC 16140
GCATTTCGTG AAGTCAACCG AGATGCGACT GTCCGGTTGG CTACCCGTGC GCTCGAGGCT 16200
GGGGTGAAGC GTTTCGTGTT TGTCAGTTCA ATTGGCGTTA ACGGTAACAG CACCCGGCAA 16260
CAGGCTTTCA ACGAAGATTC TCCAGCCGGC CCACATGCGC CCTATGCCAT CTCCAAATAC 16320
GAGGCTGAGC AGGAGCTGGG GACTTTGCTC CGGGGTAAAG GTATGGAGTT GGTGGTTGTC 16380
CGACCGCCTT TGATCTATGC CAATGATGCG CCAGGTAACT TCGGCCGTT? GCTCAAGCTC 16440
GTCGCTAGTG GTCTGCCGCT TCCGCTTGAC GGTGTCCGTA ATGCGCGCAG CCTGGTTTCT 16500
AGGAGAAACA TCGTGGGTTT CCTGAGTCTT TGTGCCGAAC ACCCCGATGC TGCGGGCGAA 16560
CTGTTTCTGG TGGCGGATGG CGAGGATGTT TCCATTGCGC AAATGATCGA GGCCCTGAGT 16620
CGGGGAATGG GCAGGCGTCC AGCTCTTTTC ACGTTTCCAG CGGTGCTGCT GAAGCTTGTA 16680
[illegible] GTGGCTCGTT ACAGGTCGAT 16740
GCTTCCAAGG CCCGCCGGCT GCTCGGCTGG GTTCCCGTCG AGACTATTGG TGCCGGTCTG 16800
CAAGCAGCAG GTCGAGAGTA CATTCTTCGC CAGAGGGAGC GCCGAAAATG ACGGACACAT 16860
CCAAACCCCT GGTCGGCAAT TACGCTGAAC TTTAATAAGT TCTCTTTCCA ATGATGATCT 16920
GGATGATCGC GTGTCTAGTT GTCTTGCTGT TTTCATTTGT CGCTACCTGG GGGCTGCGTC 16980
GCTATGCATT AGCGACGAAA CTGATGGATG TTCCGAATGC CCGTAGCTCC CACAGTCAAC 17040
CGACGCCTAG GGGGGGAGGT GTTGCAATCG TTCTGGTCTT CCTTGCAGCG TTGGTGTGGA
TGCTGAGTGC AGGCAGTATC TCCGGCGGCT GGGGGGGGGC GATGCTGGGT GCAGGTTCTG 
GCGTGGCACT GTTAGGGTTC CTGGATGACC ATGGGCACAT TGCTGCGCGT TGGCGGCTGC 
TCGGCCATTT CTCAGCAGCG ATATGGATCT TGCTGTGGAC GGGTGGTTTC CCGCCGCTGG 
ATGTGGTTGG GCATGCTGTC GACTTAGGAT GGCTGGGCCA CGTATTGGCA GTTTTCTATT 
TGGTATGGGT GCTGAACCTT TATAACTTCA TGGATGGCAT TGATGGTATT GCCAGTGTCG 
AGGCCATTGG TGTCTGTGTA GGAGGGGCCC TGATCTACTG GCTTACAGGG CATGTCGCGA 
TGGTTGGTAT CCCTCTGTTG CTGGCGTGCG CGGTCGCCGG CTTCCTGATC TGGAACTTCC 
CTCCAGCTCG AATCTTCATG GGTGATGCGG GGAGTGGTTT TCTTGGTATG GTTATGGTG 
CACTAGCTAT TCAGGCTGCA TGGACCGCCC CCTCGCTGTT CTGGTGCTGG TTGATATTGC
TGGGAGTGTT CATCGTTGAT GCAACCTATA CTCTGATCCG CCGGATCGCC AGAGGGGAGA 
AATTCTATGA GGCGCATCGC AGCCACGCTT ATCAGTTTGC CTCGCGTCGT TATGCTAGCC 
ATCTGCGGGT TACCTTGGGT GTTCTGGCTA TCAACACTCT TTGG TTGTTG CGTTGGCACT 
GATGGTTGCA TTGGGTTGGA TCAGCGGCTT CATCGGTATC CTGGTTGCTT ATGCTCCTCT 
TTGCCTCTTG GCGGTAGGAT ACAAGGCGGG TTCCTTGGAA AAATCCTAAG CCGTGGATTG 
ACCTGCTCCC CGATTTCAGT ACCACGCCGA ACTTAGTAGA GTCTGTTTTC CGAGCAGGAG 
ACGGCAGTGA AAAAGCGTTT TACTGAAGAA CAGATTCTAG ACTTTCTGAA GCAGGCAGAA 
GCCGGTGTGC CGGTGAAGGA GCTGTGTCGC CGACACAGCT TCAGTGATGC CACG^CTAC 
ACCTAGCGGG CCAAGTTCGT CGGCATGACC GTGCCGGATG CCAAGCGCCT GAAGGATCTC 
GAACTGGAAA ACAGCCGGCT GAAGAAGTTG CTCGCCGAGT CCCTCCTCGA CATCGGGGCG 
CTGAAAGTGG TCACCCGGGG AAAGGGGGAG CCCGGCAGCG GGGCGGGGGG GCAGGAGATT 
CAGGCGCAAA CCGACATCTC CGAGCGTCGT GCCCTGTCAG TTGTTCAGGC TGTCCCGCTC 
TGTGTTGTGC CACCAGCCGC GAACTAGTGT GCAAAACACC GAGCTGCAAG CCCAACTGGT 
GGAACTGGCA AGGGCTTCGG CACTTTGGCT ATCACCGCCT GCACATTCTG CTGCGGCGTG 
CTGGTGTGCA GATCAACTAC AAGCGGACTT ACCGGCTATA CTGAGCCGTC GGCTTGATGG 
TGAAGCGGCG GAGGCGCCGC CACAGGGGCG CGGTGGCGTG CGAATGCCTG AGCCTGCCGA 
GCGCACCGAA CTAGGTCTTG TCGATGGATT TCGTCTTCGA CGCGCTCAGC ACTGGGCGAC 
GGATCAAATG CCTGACGGTG G TCGATGACT TCACCAAGGA GTCGGTTGGC ATCCTGGTGG 
AGCACGGTAT CAGCGGTTTT CGTGTCACAC GGGCGCTGGA CAGATGGCAC GGTTGCGCGG 
TTACCCGAAG GCGATCCGCA CCCCCGAGrr CACCGGCAAG GCGCTTGATC AGTGGGCCTA 
TCGGCGTGAT ATTAAGTTGA AGCTGACTCA GTCCGGCAAG CCCACGCAGA ACGCCTTCAT 
CGTCATTCCA ACGGCAAGTT CCGCAATGAG CACTGCTGCT CGCTGGTCGA AGCCAGAATC 
CGCATCGTGG CCTGGCGGCA CGATTACAAC GAGCACCGAC CGTCCAGCGC CATTGGCAAT 
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CTCACCTCGC TAGAGTTTGC TGCAAGTTGG CGAACTCGCC AGCAGCAACT GAAGCAGGAA 19080
AATTGATGTC AACCCCAGGG CCTACTACCT AGGCAGCGTA CTAAAACTGG GGGCAGGTCA 19140
TCTACGATCC TTGTGATAGG TATCGACGGT GCTGTGGCGA TCCGTGCATG TGGAACTGAT 19200
CTGGGATTTT CCCTGCGTGT GTTTTCAGGG GCCTGGCAGT GATTTTTTGA GCATTGCCAT 19260
GGGGGGGCGG GTTTTTGCAT CCTGCTCGGA CGCTGGCTGA TTCCCACTCG ACGTGCTCGT 19320
GTTCGATGTC ACTTTTACTT TGCTGCTGCA TCGTTTGTTA TGAGGCGATA AAATTCGGCA 19380
GAGCTATCGA GTCACGCATG ATGGCACGTT GGTGTCGTGC TGAAGTGGCA TTTGCCGGTT 19440
ATCCTTTGTG GCTGTGATCA GTTTCTTCTG GTTATTACCC TAGCATTGCT GGTAGTACTA 19500
AGCATTATCG ACGGAGTACT TGGGGGCTTA TCGCGTATGC TCCTATGGCT TGGATGGCGA 19560
CGAGTCTTGG GAGGGGATGT CCTGAGACGT AGCGTGGGCC TTGCCATATT GTTGCCATGG 19620
TTATCTGTCT GATCTGTCTG GTTGGTATGG ATGTATTGAA CGGGGCTGAT AAATAGGATG 19680
TTGGATAATT TGAGGATAAA GCTCCTGGGA TTGCCGCGCC GCTATAAGCG AATGCTGCAA 19740
GTCGCTGCCG ATGTGACTCT TGTGTGGCTA TCCCTCTGGC TGGCTTTCTT GGTCAGGTTG 19800
GGCACAGAAG ACATGATCAG CCCGTTTAGC GGCCATGCCT GGCTGTTCAT CGCCGCCCCG 19860
TTGGTGGCCA TTCCCCTGTT CATCCGCTTC GGCATGTACC GGGCGGTGAT GCGCTACCTG 19920
GGCAACGACG CCCTTATCGC GATCGCCAAG GCCGTCACCA TTTCCGCGCT GGTCCTGTCG 19980
TTGCTGGTCT ACTGGTACCG CTCCCCGCCG GCGGTGGTGC CGCGTTCCCT GGTGTTCAAC 20040
TACTGGTGGT TGAGCATGCT GCTGATCGGC GGCTTGCGTC TGGCCATGCG CCAGTATTTC 20100
ATGGGAGACT GGTACTCTGC TGTGCAGTCG GTACCATTTC TCAACCGCCA GGATGGCCTG 20160
CCCAGGGTGG CTATCTATGG CGCGGGGGCG GCCGCCAACC AGTTGGTTGC GGCATTGCGT 20220
CTCGGTCGGG CGATGCGTCC GGTGGCGTTC ATCGATGATG ACAAGCAGAT CGCCAACCGG 20280
GTCATCGCCG GTCTGCGGGT CTATACCGCC AAGCATATCC GCCAGATGAT CGACGAGACG 20340
GGCGCGCAGG AGGTTCTCCT GGCGATTCCT TCCGCCACTC GGGCCCGGCG CCGAGAGATT 20400
CTCGAGTCCC TGGAGCCGTT CCCGCTGCAC GTGCGCAGCA TGCCCGGCTT CATGGACCTG 20460
ACCAGCGGCC GGGTCAAGGT GGACGACCTG CAGGAGGTGG ACATCGCTGA CCTGCTGGGG 20520
CGCGACAGCG TCGCACCGCG CAAGGAGCTG CTGGAACGTT GCATCCGCGG TCAGGTGGTG 20580
ATGGTGACCG GGGCGGGCGG CTCTATCGGT TCGGAACTCT GTCGGCAGAT CATGAGTTGT 20640
TCGCCTAGCG TGCTGATCCT GTTCGAGCAC AGCGAATACA ACCTCTATAG CATCCATCAG 20700
GAACTGGAGC GTCGGATCAA GCGCGAGTCG CTTTCGGTGA ACCTGTTGCC GATCCTCGGT 20760
TCGGTGCGCA ATCCCGAGCG CCTGGTGGAC GTGATGCGTA CCTGGAAGGT CAATACCGTC 20820
TACCATGCGG CGGCCTACAA GCATGTGCCG ATCGTCGAGC ACAACATCGC CGAGGGCGTT 20880
CTCAACAACG TGATAGGCAC CTTGCATGCG GTGCAGGCCG CGGTGCAGGT CGGCGTGCAG 20940
AACTTCGTGC TGATTTCCAC CGACAAGGCG GTGCGACCGA CCAATGTGAT GGGCAGCACC 21000
AAGCGCCTGG CGGAGATGGT CCTTCAGGCG CTCAGCAACG AATCGGCACC GTTGCTGTTC 21060
GGCGATCGGA AGGACGTGCA TCACGTCAAC AAGACCCGTT TCACAA7GGT CCGCTTCGGC 21120
AACGTCCTCG GTTCGTCCGG TTCGGTCATT CCGCTGTTCC GCGAGCAGAT CAAGCGCGGC 21180
GGCCCGGTGA CGGTCACCCA CCCGAGCATC ACCCGTTACT TCATGACCAT TCCCGAGGCA 21240
GCGCAGTTGG TCATCCAGGC CGGTTCGATG GGGCAGGGCG GAGATGTATT CGTGCTGGAC 21300
ATGGGGCCGC CGGTGAAGAT CCTGGAGCTC GCCGAGAAGA TGATCCACCT GTCCGGCCTG 21360
AGCGTGCGTT CCGAGCGTTC GCCCCATGGT GACATCGCCA TCGAGTTCAG TGGCCTGCGT 21420
CCTGGCGAGA AGCTCTACGA AGAGCTGCTG ATCGGTGACA ACGTGAATCC CACCGACCAT 21480
CCGATGATCA TGCGGGCCAA CGAGGAACAC CTGAGCTGGG AGGCCTTCAA GGTCGTGCTG 21540
GAGCAGTTGC TGGCCGCCGT GGAGAAGGAC GACTACTCGC GGGTTCGCCA GTTGCTGCGG 21600
GAAACCGTCA GCGGCTATGC GCCTGACGGT GAAATCGTCG ACTGGATCTA TCGCCAGAGG 21660
CGGCGAGAAC CCTGAGTCAT CGTTCTCCGG AAAAGGCCGC CTAGCGGCCT TTTTTGTTTT 21720
CTCCGTACGA TGTTTCCGGT GCCGGACCAG GAAGCGACTG CTTTGCTGGG GCTGTCGATC 21780
CAGGTGCGTT CCACGGCGAT AAGGTGGTTT CGTGGATGGG CATGAAGCCC TCTACGTGGT 21840
CATTCATCTC TGAAGGAGTG CACCCATGCA CCTAATCAAA TCCGCTCTGC TTCTCATCCT 21900
GTTCGCCTGT CTTCCGTTTT CGGCTTCCGC CGCACCGGTC GCCGTCGCCA AGAATCCGCT 21960
GGCCGCAACG ACACCTGCGA CGACCGTGTC GCCGGGGGAG CAGGTCAATA TCAATACGGT 22020
CGACGAGGCC GCCCTGATAC GGGGGCTCAA CGGTGTCGGC GAGGCCAAGG CCACGGCGAT 22080
CCTCGAGTAT CGTGCGGCCC ATGGTCCGTT CGTCTCGGTG GATCAACTGC TGGAAGTGAA 22140
AGGGGTAGGC CCGGCGTTGC TGGAGAAGAA CCGGGCGCGG ATCGTCATCG AGTGAGGTGC 22200
GACTGAAGGG GCGAACTTTC GTCCCGATAA CGAAAAAGCC CCCGGCATGT GCCGAGGGCT 22260
TTGAATTTGG CTCCGCGACC TGGACTCGAA CCAGGGACCC AATGATTAAC AGTCATTTGC 22320
TCTACCGACT GAGCTATCGC GGAACAGCGA GGCGTATGTT ACTGATTAAA AAGGGGAAGC 22380
CTCTCCCGAT GACTTCCCCA TTTTCCCTAC AGGACCTGGA CGATGGCCTT GGTGATGGTC 22440
TCCAGGTTCG ATTTGTTCAG CGCGGCGACG CAGATACGGC CGGTGCTGAC GGCGTAGATA 22500
CCGAACTCGG TCTTCAGGCG CTCGACCTGG TCGGCGGTCA GGCCGGAATA GGAGAACATG 22560
CCACGTTGGC GACCGACGAA ACTGAAGTCG CGCTTGGCGC CGTGGGCTGC CAGTTGCTCG 22620
ACCATCGCCA GGCGCATGTC GCGGATGCGG TCGCGCATCT CGCCCAGTTC CTGCTCCCAG 22680
AGGGCCCGCA GTTCCGGGCT GTTGAGCACG GAGGAGACGA CGCTGGCGCC GTGGGTCGGT 22740
GGGTTCGAAT AGTTGGTGCG GATCACCCGC TTCACCTGGG ACAGCACGCG GGCCGATTCA 22800
TCGCGGCTTT CGGTCACGAT CGAGAGGGCG CCGACGCGTT CGCCATAGAG CGAGAAGGAT 22860
TTGGAGAACG AGCTGGAAAC GAAGAAGCTC AGGCCCGACT GGGCGAACAG GCGCACCGCG 22920
GCGGCGTCTT CCTCGATGCC GTTGCCGAAG CCCTGGTAGG CGATGTCGAG GAACGGCACG 22980
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TGGCCCTTGG CCTTGAGCAC GTCCAGCACC TGTTTCCAGT CGTCCAGCTC GAGATCGACG 23040
CCGGTCGGAT TATGGCAGCA GGCGTGCAGA ACCACGATCG AGCGGGCCGG CAGGGCATTC 23100
AGGTCTTCCA GCAGGCCGGC GCGGTTCACG CCATTGCTGG CGGCGTCGTA ATAGCGGTAG 23160
TTCTGCACCG GGAAGCCGGC GGCTTCGAAC AGTGCGCGGT GGTTTTCCCA GCTCGGGTCG 23220
CTGATGGCCA CGGTGGCGTC GGGCAGCAGG CGCTTGAGGA AGTCGGCGCC GAGCTTGAGC 23280
GCGCCGGTGC CGCCGACGGC CTGGGTCGTG ACCACACGGC CGGCGGCCAG CAGCTCGGAC 23340
TCGTTACCGA ACAGCAGTTT CTGTACGCCC TGGTCGTAGG CGGCGATCCC TTCGATCGGC 23400
AGGTAGCCGC GCGGCGCGTG GGCCTCGATG CGGGCCTTCT CGGCAGCCTG CACGGCACGC 23460
AACAGCGGAA TGCGCCCCTC CTCGTTGTAG TACACGCCCA CGCCCAGGTT GATCTTGCCC 23520
GGACGGGTAT CGGCGTTGAA GGCTTCGTTC AGGCCAAGGA TGGGATCACG CGGTGCCATT 23580
TCGACGGCAG AAAACAGACT CATTTTGCGG CTGCTCGGAG TGTGAAGAGA GGAGGGCAAC 23640
GCAACCCGTT ATGCGGGGGC GCAAAGGGTT GCGCAAACGG GGGGTTATTA TAGACACCCC 23700
TTGATGCATG CGGCGACATT TAGGTGCATG CTTTCAGCTA TTTCTGACGC CGGATTTTCC 23760
TTGGCGTCAC AGCTCCCTGC GAGGTTTTTC ATGGATACGT TCCAACTCGA CTCGCGCTTC 23820
AAGCCCGCCG GCGACCAGCC GGAAGCCATC CGGCAAATGG TCGAGGGGCT GGAGGCGGGG 23880
CTTTCGCACC AGACCCTGCT GGGGGTGACG GGCTCTGGCA AGACTTTCAG CATCGCCAAC 23940
GTGATTGCCC AGGTGCAGCG CCCGACCCTG GTCCTGGCGC CGAACAAGAC CCTGGCGGCC 24000
CAGCTCTACG GGGAGTTCAA GACGTTCTTC CCGCACAATT CCGTGGAGTA CTTCGTTTCC 24060
TACTACGACT ACTACCAGCC GGAGGCCTAC GfCCCGTCTT CCGATACCTA TATCGAGAAG 24120
GACTCCTCGA TCAACGACCA TATCGAGCAG ATGCGCCTGT CGGCGACCAA GGCGCTGCTC 24180
GAGCGTCCGG ATGCGATCAT CGTCGCCACC GTGTCGTCCA TCTACGGCCT CGGTGATCCC 24240
GCGTCCTACC TGAAGATGGT CCTGCACCTG GACCGCGGCG ACCGCATCGA CCAGCGCGAA 24300
CTGCTGCGGC GACTGACCAG CCTGCAGTAC ACCCGCAACG ACATGGATTT CGCCCGTGCG 24360
ACTTTCCGTG TGCGTGGCGA TGTGATCGAC ATCTTCCCGG CCGAATCCGA TCTCGAG 24417



(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 158 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Pseudomonas aeruginosa 

(B) STRAIN: PA01 

(vii) IMMEDIATE SOURCE: 
(B) CLONE: rol 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Ar, Asp lle Glu Gin Ar 9 He Cln As „ Leu Arg Arg Glu ^ ^ ^ 

Ar g Arg Glu Arg Ile val Qln Lys Mu a ^ ^ ^ ^ 

25 30 
Cly Ala Leu Lys ^ Glu Glu pro Leu ^ ^ 

Glu Glu Uu Ser Ala He Met Asn Gly Ser Leu Met Tyr Met Arg Gly 
|er Lys Ala He Met Ala Clu He Gin Thr L| u Glu Ala Ar g Ser Ser 

Asp Asp Pro P he lie Pro Ala Leu Arg ^ ^ 

90 95 
Leu Leu ser Ser Leu Ar g Val Asn Ser Clu Ar g Val Ser Val Ph e Ar g 

Cln Asp Gly Pro He Glu Thr Pro Asp Ser Pro Val Ar f Pr! Ar 9 Ar 9 
Ala Met He Leu He P„ e Gly Leu He He Gly Gly ^1 Leu G!y Gly 
Phe Leu Ala Leu Cys Arc He Phe Leu Lys Lys Tyr Ala Ar g 
(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 436 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(vi) ORIGINAL SOURCE:

(vii) IMMEDIATE SOURCE:
(B) CLONE: psbA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

f t He Asp Val Asn Thr Val Val Glu " Lys P he Lys Ser Ar g Gin Ala 

S " g- °* «-" £r V al „ y ... Pto ^ M „ teu 

- s ^ ^ Ala „. Gly phe J?p ^ ^ » ^ ^ 

V. Lys val Asp Lys Leu S| „ A1 . 01y 01n ^ ^ ^ 

g. in «. Ly , „. „. Ly , A1 , ^ " y ^ t ^ 
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Thr Asp Phe Ser Arg Val Ser Glu Cys Asp Ala Leu Ile Leu Cys Val
85 90 95

Pro Thr Pro Leu Asn Lys Tyr Arg Glu Pro Asp Met Ser Phe Val Ile
100 105 110

Asn Thr Thr Asp Ala Leu Lys Pro Tyr Leu Arg Val Gly Gln Val Val
115 120 125

Ser Leu Glu Ser Thr Thr Tyr Pro Gly Thr Thr Glu Glu Glu Leu Leu
130 135 140

Pro Arg Val Gln Glu Gly Gly Leu Val Val Gly Arg Asp Ile Tyr Leu
145 150 155 160

Val Tyr Ser Pro Glu Arg Glu Asp Pro Gly Asn Pro Asn Phe Glu Thr
165 170 175

Arg Thr Ile Pro Lys Val Ile Gly Gly His Thr Pro Gln Cys Leu Glu
180 185 190

Val Gly Ile Ala Leu Tyr Glu Gln Ala Ile Asp Arg Val Val Pro Val
195 200 205

Ser Ser Thr Lys Ala Ala Glu Met Thr Lys Leu Leu Glu Asn Ile His
210 215 220

Arg Ala Val Asn Ile Gly Leu Val Asn Glu Met Lys Ile Val Ala Asp
225 230 235 240

Arg Met Gly Ile Asp Ile Phe Glu Val Val Asp Ala Ala Ala Thr Lys
245 250 255

Pro Phe Gly Phe Thr Pro Tyr Tyr Pro Gly Pro Gly Leu Gly Gly His
260 265 270

Cys Ile Pro Ile Asp Pro Phe Tyr Leu Thr Trp Lys Ala Arg Glu Tyr
275 280 285

Gly Leu His Thr Arg Phe Ile Glu Leu Ser Gly Glu Val Asn Gln Ala
290 295 300

Met Pro Glu Tyr Val Leu Gly Lys Leu Met Asp Gly Leu Asn Glu Ala
305 310 315 320

Gly Arg Ala Leu Lys Gly Ser Arg Val Leu Val Leu Gly Ile Ala Tyr
325 330 335

Lys Lys Asn Val Asp Asp Met Arg Glu Ser Pro Ser Val Glu Ile Met
340 345 350

Glu Leu Ile Glu Ala Lys Gly Gly Met Val Ala Tyr Ser Asp Pro His
355 360 365

Val Pro Val Phe Pro Lys Met Arg Glu His His Phe Glu Leu Ser Ser
370 375 380

Glu Pro Leu Thr Ala Glu Asn Leu Ala Arg Phe Asp Ala Val Val Leu
385 390 395 400

Ala Thr Asp His Asp Lys Phe Asp Tyr Glu Leu Ile Lys Ala Glu Ala
405 410 415

Lys Leu Val Val Asp Ser Arg Gly Lys Tyr Arg Ser Pro Ala Ala His
420 425 430
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Ile Ile Lys Ala
435



(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 316 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Pseudomonas aeruginosa

(B) STRAIN: PA01

(vii) IMMEDIATE SOURCE: 
(B) CLONE: psbB 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

Met Lys Asn Phe Ala Leu Ile Gly Ala Ala Gly Tyr Ile Ala Pro Arg

His Met Arg Ala Ile Lys Asp Thr Gly Asn Cys Leu Val Ser Ala Tyr

2b 30 
Asp xi. Asn Asp ser Val Gly jj. Ile Asp Ser ^ ^ ^ ^ 

Glu Phe Phe Thr « u Phe Glu Phe Phe Leu Asp „ is Ser Asn ^ 

" 60 
g. AT, Asp S.r „. Thr A1 . ^ u Asp ^ ^ ^ ne c ^ ^ ^ 

— Tyr Leu His Tyr Pro „ is „. 4U ^ ^ ^ ^ ^ ^ ^ 

AS P W „. Cys C lu L ys Pro ^ u vg Pro Thr Pro 0lu "„ Asp 

«» L.„ „. Val Ile 01 „ ^ ^ ^ ^ ^ ^ ^° a ^ 

"° 125 
L eu Gin Leu Arg His „ is G1 Ala n . Ile ^ ^ ^ ^ 

" 140 
Jl. Aro «» Lys ser p„ His Lys ^ ^ ^ l ^ t ^ ^ ^ 

Thr Ser Arg Gly ?| „ Trp ^ ^ ^ ^ ^ ^ ^ ™ 

«« Ser Phe val ila Tht Gly ^ ^ ^ 

^ IS "* "» <~ J~ «» »r, As„ v.. VJj ^" Phe Thr 

S.r Ol. Tyr Lys Th, A1 , cjy Tyr ^ 01u Tyr g„ "„ Ai. , r , v.! 



225 230 235 240 

Gly Lys Lys Pro Thr Tyr Arg Ser Ile Thr Val Asn Gly Glu Glu Met
245 250 255

Glu Phe Ser Glu Gly Phe Thr Asp Leu His Thr Thr Ser Tyr Glu Glu
260 265 270

Ile Leu Ala Gly Arg Gly Tyr Gly Ile Asp Asp Ala Arg His Cys Val
275 280 285

Glu Thr Val Asn Thr Ile Arg Ser Ala Val Ile Val Pro Ala Ser Asp
290 295 300

Asn Glu Gly His Pro Phe Val Ala Ala Leu Ala Arg
305 310 315

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 766 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Pseudomonas aeruginosa 

(B) STRAIN: PA01 

(vii) IMMEDIATE SOURCE: 
(B) CLONE: psbC 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

Met Leu Cys Thr Ser Leu Pro Ser Thr Arg Gln Leu Val Ile Trp Ser
1 5 10 15

Thr Ser Arg Pro Val Cys Val Gly Phe Cys Pro Trp Met Leu Thr Thr
20 25 30

Cys Arg Ser Arg Ser Arg Ala Lys Ser Arg Pro Ile Val Arg Leu Pro
35 40 45

Ser Thr Val Arg Lys Trp Ser Ser Leu Lys Ala Leu Pro Ile Tyr Ile
50 55 60

Gln Pro Ala Thr Lys Lys Phe Ser Leu Val Val Val Met Ala Ser Met
65 70 75 80

Thr Leu Val Ile Val Trp Lys Leu Ser Ile Pro Phe Ala Ala Pro Ser
85 90 95

Ser Tyr Arg Pro Leu Ile Thr Lys Gly Ile Arg Ser Ser Arg Arg Leu
100 105 110

Arg Val Glu Val Glu Lys Glu Trp Pro Ser Ser Val Thr Cys Leu Gln
115 120 125

Gln Val Ser Ala Gly Ser Phe Ile Ser Met Ser Ser Ser Ser Ser Lys
130 135 140

Leu Leu Asn Gly Met Val Ala Val Ser Ser Gly Arg Asn Ile Arg Leu
145 150 155 160
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Asp V.l G1 „ Gly Leu ^ yal AU ^ v ^ [ _ eu ^ 

Hi, Al. As„ Ser Al. Trp ^ ^ oly phe %i ^ ^ 

185 190 
XII «~ "» Jj; «~ L.u Val „„ ^ 

«» V.l Lys v.l Asp Leu v.l „„ „ Tyr Ala g? Z „. Lys Aro 
lie Phe Pro Al. * Ph. v,l Met Leu Al. „. v.1 Cys He v.l f „ 
Thr He Leu p h e pro Asp Asp ^ ^ ^ ^ ™ 

«» Ser S.r Val Ph. P„e ser s „ A| „ „. ^ ^ ^ ^ ^ ^ 

S " 151 *" "* «• 1» ^ Pro Leu Leu I" Thr ^ 

S.r lie Al. Asn 01. „e t sin Phe Tyr Leu Ph. Tyr p„ Val ^ phe 

300 

«« Cvs Leu Pro Cys Ar g Trp Ar s Leu Pro Val Ph. He Leu Leu Ala 
II. L- Leu Phe XI. Trp ser Gly ^ g. yal phe s<r ^ ™ 

ASP Ala Oln Tyr Phe Al. L.u Leu Ala Ar„ Val Pro Olu Phe Z s.r 

J45 350 
S1V Ala v.l v.1 Ala Leu Ser Leu Ar s Asp Ar s G lu Leu Pro A!a Aro 

Leu Al, n. Leu Al. Cly Leu Leu G ly Ala Al. Leu Leu v.l Cys s.r 

Jg II. Asp Lys G l„ „ i6 Ph. P„ G ly Z Irp s „ Leu _ 

J95 400 
°" ^ SS "* ^ — "« «• «■ «*• Ar 3 G ly Pro Al. 

S.r Leu Leu Leu Ala Ser Ar 8 Pro Met Va! Trp 11. G i y „ ? ") ^ 

Tyr Ser Leu Tyr Leu Trp His Trp Pro He Leu Al. Ph. Z Ar3 Tyr 
Thr G ly G l„ ^ Giu ffi s „ phe ^ ^ ^ ^ ^ ^ ^ 

JHr G ly ser Ph. Leu Leu Al, Trp Phe Ser Tyr Z Tyr He G l„ Thr 
Pro Al. Ar, Lys Ala Val G ly Leu Ar g gj „„ Ai . ^ L „ ^ ^ 

«« Al. Al. ser Va! v.l Ala He Val val Thr oiy G ly Ala Z Ph. 

b0:> 510 
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Asn Val Leu Val Val Ala Pro Ala Pro Ile Gln Leu Thr Arg Tyr Ala
515 520 525

Val Pro Glu Ser Ile Cys His Gly Val Gln Val Gly Glu Cys Lys Arg
530 535 540

Gly Ser Val Asn Ala Val Pro Arg Val Leu Val Ile Gly Asp Ser His
545 550 555 560

Ala Ala Gln Leu Asn Tyr Phe Phe Asp Val Val Gly Asn Glu Ser Gly
565 570 575

Val Ala Tyr Arg Val Leu Thr Gly Ser Ser Cys Val Pro Ile Pro Ala
580 585 590

Phe Asp Leu Glu Arg Leu Pro Arg Trp Ala Arg Lys Pro Cys Gln Ala
595 600 605

Gln Ile Asp Ala Val Ala Gln Ser Met Leu Asn Phe Asp Lys Ile Ile
610 615 620

Val Ala Gly Met Trp Gln Tyr Gln Met Gln Ser Pro Ala Phe Ala Gln
625 630 635 640

Ala Met Arg Ala Phe Leu Val Asp Thr Ser Tyr Ala Gly Lys Gln Val
645 650 655

Ala Leu Leu Gly Gln Ile Pro Met Phe Glu Ser Asn Val Gln Arg Val
660 665 670

Arg Arg Phe Arg Glu Leu Gly Leu Ser Ala Pro Leu Val Ser Ser Ser
675 680 685

Trp Gln Gly Ala Asn Gln Leu Leu Arg Ala Leu Ala Glu Gly Ile Pro
690 695 700

Asn Val Arg Phe Met Asp Phe Ser Ser Ser Ala Phe Phe Ala Asp Ala
705 710 715 720

Pro Tyr Gln Asp Gly Glu Leu Ile Tyr Gln Asp Ser His His Leu Asn
725 730 735

Glu Val Gly Ala Arg Arg Tyr Gly Tyr Phe Ala Ser Arg Gln Leu Gln
740 745 750

Arg Leu Phe Glu Gln Pro Gln Ser Ser Val Ser Leu Lys Pro
755 760 765

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 160 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Pseudomonas aeruginosa 

(B) STRAIN: PA01 

(vii) IMMEDIATE SOURCE: 
(B) CLONE: psbD 
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

Met Ser Tyr Tyr Gln His Pro Ser Ala Ile Val Asp Asp Gly Ala Gln
1 5 10 15

Ile Gly Ser Asp Ser Arg Val Trp His Phe Val His Ile Cys Ala Gly
20 25 30

Ala Arg Ile Gly Ala Gly Val Ser Leu Gly Gln Asn Val Phe Val Gly
35 40 45

Asn Lys Val Val Ile Gly Asp Arg Cys Lys Ile Gln Asn Asn Val Ser
50 55 60

Val Tyr Asp Asn Val Thr Leu Glu Glu Gly Val Phe Cys Gly Pro Ser
65 70 75 80

Met Val Phe Thr Asn Val Tyr Asn Pro Arg Ser Leu Ile Glu Arg Lys
85 90 95

Asp Gln Tyr Arg Asn Thr Leu Val Lys Lys Gly Ala Thr Leu Gly Ala
100 105 110

Asn Cys Thr Ile Val Cys Gly Val Thr Ile Gly Glu Tyr Ala Phe Leu
115 120 125

Gly Ala Gly Ala Val Ile Asn Lys Asn Val Pro Ser Tyr Ala Leu Met
130 135 140

Val Gly Val Pro Ala Arg Gln Ile Gly Trp Ile Ala Asn Ser Val Ser
145 150 155 160

(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 276 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide
(vi) ORIGINAL SOURCE:

(A) ORGANISM: Pseudomonas aeruginosa
(B) STRAIN: PA01

(vii) IMMEDIATE SOURCE: 
(B) CLONE: psbE 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

Met Ile Glu Phe Ile Asp Leu Lys Asn Gln Gln Ala Arg Ile Lys Asp
1 5 10 15

Lys Ile Asp Ala Gly Ile Gln Arg Val Leu Arg His Gly Gln Tyr Ile
20 25 30

Leu Gly Pro Glu Val Thr Glu Leu Glu Asp Arg Leu Ala Asp Phe Val
35 40 45

Gly Ala Lys Tyr Cys Ile Ser Cys Ala Asn Gly Thr Asp Ala Leu Gln
50 55 60

Ile Val Gln Met Ala Leu Gly Val Gly Pro Gly Asp Glu Val Ile Thr
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65 70 75 80

Pro Gly Phe Thr Tyr Val Ala Thr Ala Glu Thr Val Ala Leu Leu Gly
85 90 95

Ala Lys Pro Val Tyr Val Asp Ile Asp Pro Arg Thr Tyr Asn Leu Asp
100 105 110

Pro Gln Leu Leu Glu Ala Ala Ile Thr Pro Arg Thr Lys Ala Ile Ile
115 120 125

Pro Val Ser Leu Tyr Gly Gln Cys Ala Asp Phe Asp Ala Ile Asn Ala
130 135 140

Ile Ala Ser Lys Tyr Gly Ile Pro Val Ile Glu Asp Ala Ala Gln Ser
145 150 155 160

Phe Gly Ala Ser Tyr Lys Gly Lys Arg Ser Cys Asn Leu Ser Thr Val
165 170 175

Ala Cys Thr Ser Phe Phe Pro Ser Lys Pro Leu Gly Cys Tyr Gly Asp
180 185 190

Gly Gly Ala Ile Phe Thr Asn Asp Asp Glu Leu Ala Thr Ala Ile Arg
195 200 205

Gln Ile Ala Arg His Gly Gln Asp Arg Arg Tyr His His Ile Arg Val
210 215 220

Gly Val Asn Ser Arg Leu Asp Thr Leu Gln Ala Ala Ile Leu Leu Pro
225 230 235 240

Lys Leu Glu Ile Phe Glu Glu Glu Ile Ala Leu Arg Gln Lys Val Ala
245 250 255

Ala Glu Tyr Asp Leu Ser Leu Lys Gln Val Gly Ile Gly Thr Pro Phe
260 265 270

Ile Gly Ser Gly
275

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 438 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Pseudomonas aeruginosa

(B) STRAIN: PA01

(vii) IMMEDIATE SOURCE:
(B) CLONE: rfc a



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

Met Tyr Ile Leu Ala Arg Val Asp Arg Ser Ile Leu Leu Asn Thr Val
1 5 10 15

Leu Leu Phe Ala Phe Phe Ser Ala Thr Val Trp Val Asn Asn Asn Tyr
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20 25 30

Ile Tyr His Leu Tyr Asp Tyr Met Gly Ser Ala Lys Lys Thr Val Asp
35 40 45

Phe Gly Leu Tyr Pro Tyr Leu Met Val Leu Ala Leu Ile Cys Ala Leu
50 55 60

Leu Cys Gly Gly Ala Ile Arg Arg Pro Gly Asp Leu Leu Val Thr Leu
65 70 75 80

Leu Val Val Ile Leu Val Pro His Ser Leu Val Leu Asn Gly Ala Asn
85 90 95

Gln Tyr Ser Pro Asp Ala Gln Pro Trp Ala Gly Val Pro Leu Ala Ile
100 105 110

Ala Phe Gly Ile Leu Ile Ile Gly Ile Val Asn Lys Ile Arg Phe His
115 120 125

Pro Leu Gly Ala Leu Gln Arg Glu Asn Gln Gly Arg Arg Met Leu Val
130 135 140

Leu Leu Ser Val Leu Asn Ile Val Val Leu Val Phe Ile Phe Phe Lys
145 150 155 160

Ser Ala Gly Tyr Phe Ser Phe Asp Phe Ala Gly Gln Tyr Ala Arg Arg
165 170 175

Ala Leu Ala Arg Glu Val Phe Ala Ala Gly Ser Ala Asn Gly Tyr Leu
180 185 190

Ser Ser Ile Gly Thr Gln Ala Phe Phe Pro Val Leu Phe Ala Trp Gly
195 200 205

Val Tyr Arg Arg Gln Trp Phe Tyr Leu Val Leu Gly Ile Val Asn Ala
210 215 220

Leu Val Leu Trp Gly Ala Phe Gly Gln Lys Tyr Pro Phe Val Val Leu
225 230 235 240

Phe Leu Ile Tyr Gly Leu Met Val Tyr Phe Arg Arg Phe Gly Gln Val
245 250 255

Arg Val Ser Trp Val Val Cys Ala Leu Leu Met Leu Leu Leu Leu Gly
260 265 270
Ala Leu Glu His Glu Val Phe Gly Tvr Ser Ph*> t ^ * ^ 

275 280 P Tyr Phe 

285 

Leu Arg Arg Ala Phe Ile Val Pro Ser Thr Leu Leu Gly Ala Val Asp
290 295 300

Gln Phe Val Ser Gln Phe Gly Ser Asn Tyr Tyr Arg Asp Thr Leu Leu
305 310 315 320

Gly Ala Leu Leu Gly Gln Gly Arg Thr Glu Pro Leu Ser Phe Arg Leu
325 330 335

Gly Thr Glu Ile Phe Asn Asn Pro Asp Met Asn Ala Asn Val Z Phe
340 345 350

Phe Ala Ile Ala Tyr Met Gln Leu Gly Tyr Val Gly Val Met Ala Glu
355 360 365

Ser Met Leu Val Gly Gly Ser Val Val Leu Met Asn Phe Leu Phe Ser
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370 375 380

Arg Tyr Gly Ala Phe Met Ala Ile Pro Val Ala Leu Leu Phe Thr Thr
385 390 395 400

Lys Ile Leu Glu Gln Pro Leu Leu Thr Val Met Leu Gly Ser Gly Val
405 410 415

Phe Leu Ile Leu Leu Phe Leu Ala Leu Ile Ser Phe Pro Leu Lys Met
420 425 430

Ser Leu Gly Lys Thr Leu
435

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 316 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Pseudomonas aeruginosa 
(B) STRAIN: PA01

(vii) IMMEDIATE SOURCE: 

(B) CLONE: psbF 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

Met Ser Ala Ala Phe Ile Asn Arg Val Ala Arg Val Leu Val Gly Thr
1 5 10 15

Leu Gly Ala Gln Leu Ile Thr Ile Gly Val Thr Leu Leu Leu Val Arg
20 25 30

Leu Tyr Ser Pro Ala Glu Met Gly Ala Phe Ser Val Trp Leu Ser Phe
35 40 45

Ala Thr Ile Phe Ala Val Val Val Thr Gly Arg Tyr Glu Leu Ala Ile
50 55 60

Phe Ser Thr Arg Glu Glu Gly Glu Leu Gln Ala Ile Val Lys Leu Ile
65 70 75 80

Leu Gln Leu Thr Leu Leu Ile Phe Val Ala Val Ala Ile Ala Val Val
85 90 95

Ile Gly Arg His Leu Ile Glu Ser Met Pro Val Val Ile Gly Glu Tyr
100 105 110

Trp Phe Ala Leu Ala Val Ala Ser Leu Gly Leu Gly Ile Asn Lys Leu
115 120 125

Val Leu Ser Leu Leu Thr Phe Gln Gln Ser Phe Asn Arg Leu Gly Val
130 135 140

Ala Arg Val Ser Leu Ala Ala Cys Ile Ala Val Ala Gln Val Ser Ala
145 150 155 160

Ala Tyr Leu Leu Glu Gly Val Ser Gly Leu Ile Tyr Gly Gln Leu Phe
165 170 175
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Gly Val Val Val Ala Thr Ala Leu Ala Ala Leu Trp Val Gly Lys Ser
180 185 190

Leu Ile Leu Asn Cys Ile Glu Thr Pro Trp Arg Met Val Arg Gln Val
195 200 205

Ala Val Gln Tyr Ile Asn Phe Pro Lys Phe Ser Leu Pro Ala Asp Leu
210 215 220

Val Asn Thr Val Ala Ser Gln Val Pro Val Ile Leu Leu Ala Ala Lys
225 230 235 240

Phe Gly Gly Asp Ser Ala Gly Trp Phe Ala Leu Thr Leu Lys Ile Met
245 250 255

Gly Ala Pro Ile Ser Leu Leu Ala Ala Ser Val Leu Asp Val Phe Lys
260 265 270

Glu Gln Ala Ala Arg Asp Tyr Arg Glu Phe Gly Asn Cys Arg Gly Ile
275 280 285

Phe Leu Lys Thr Phe Arg Leu Leu Ala Val Leu Ala Leu Pro Pro Phe
290 295 300

Ile Ile Phe Gly Ser Leu Ala Ser Gly Pro Leu Gly
305 310 315

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 118 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Pseudomonas aeruginosa 

(B) STRAIN: PA01 

(vii) IMMEDIATE SOURCE:
(B) CLONE: hisH



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

Met Leu Gly Leu Arg Ser Glu Glu Gly Ala Glu Pro Gly Leu Gly Trp
1 5 10 15

Ile Asp Met Asp Ser Val Arg Phe Glu Arg Arg Asp Asp Arg Lys Val
20 25 30

Pro His Met Gly Trp Asn Gln Val Ser Pro Gln Leu Glu His Pro Ile
35 40 45

Leu Ser Gly Ile Asn Glu Gln Ser Arg Phe Tyr Phe Val His Ser Tyr
50 55 60

Tyr Met Val Pro Lys Asp Pro Asp Asp Ile Leu Leu Ser Cys Asn Tyr
65 70 75 80

Gly Gln Lys Phe Thr Ala Ala Val Ala Arg Asp Asn Val Phe Gly Phe
85 90 95
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Gln Phe His Pro Glu Lys Ser His Lys Phe Gly Met Gln Leu Phe Lys
100 105 110

Asn Phe Val Glu Leu Val
115

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 251 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Pseudomonas aeruginosa 
(B) STRAIN: PA01

(vii) IMMEDIATE SOURCE:
(B) CLONE: hisF

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

Met Val Arg Arg Arg Val Ile Pro Cys Leu Leu Leu Lys Asp Arg Gly
1 5 10 15

Leu Val Lys Thr Val Lys Phe Lys Glu Pro Lys Tyr Val Gly Asp Pro
20 25 30

Ile Asn Ala Ile Arg Ile Phe Asn Glu Lys Glu Val Asp Glu Leu Ile
35 40 45

Leu Leu Asp Ile Asp Ala Ser Arg Leu Asn Gln Glu Pro Asn Tyr Glu
50 55 60

Leu Ile Ala Glu Val Ala Gly Glu Cys Phe Met Pro Ile Cys Tyr Gly
65 70 75 80

Gly Gly Ile Lys Thr Leu Glu His Ala Glu Lys Ile Phe Ser Leu Gly
85 90 95

Val Glu Lys Val Ser Ile Asn Thr Ala Ala Leu Met Asp Leu Ser Leu
100 105 110

Ile Arg Arg Ile Ala Asp Lys Phe Gly Ser Gln Ser Val Val Gly Ser
115 120 125

Ile Asp Cys Arg Lys Gly Phe Trp Gly Gly His Ser Val Phe Ser Glu
130 135 140

Asn Gly Thr Arg Asp Met Lys Arg Ser Pro Leu Glu Trp Ala Gln Ala
145 150 155 160

Leu Glu Glu Ala Gly Val Gly Glu Ile Phe Leu Asn Ser Ile Asp Arg
165 170 175

Asp Gly Val Gln Lys Gly Phe Asp Asn Ala Leu Val Glu Asn Ile Ala
180 185 190

Ser Asn Val His Val Pro Val Ile Ala Cys Gly Gly Ala Gly Ser Ile
195 200 205

Ala Asp Leu Ile Asp Leu Phe Glu Arg Thr Cys Val Ser Ala Val Ala
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210 215 220

Ala Gly Ser Leu Phe Val Phe His Gly Lys His Arg Ala Val Leu Ile
225 230 235 240

Ser Tyr Pro Asp Val Asn Lys Leu Asp Val Gly
245 250
(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 376 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Pseudomonas aeruginosa

(B) STRAIN: PA01

(vii) IMMEDIATE SOURCE:
(B) CLONE: psbG

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

Met Lys Ile Cys Ser Arg Cys Val Met Asp Thr Ser Asp Ala Glu Ile
1 5 10 15

Val Phe Asp Glu Ala Gly Val Cys Asn His Cys His Lys Phe Asp Asn
20 25 30

Val Gln Ser Arg Gln Leu Phe Ser Asp Ala Ser Gly Glu Gln Arg Leu
35 40 45

Gln Lys Ile Ile Gly Gln Ile Lys Lys Asp Gly Ser Gly Lys Asp Tyr
50 55 60

Asp Cys Ile Ile Gly Leu Ser Gly Gly Val Asp Ser Ser Tyr Leu Ala
65 70 75 80

Val Lys Val Lys Asp Leu Gly Leu Arg Pro Leu Val Val His Val Asp
85 90 95

Ala Gly Trp Asn Ser Glu Leu Ala Val Ser Asn Ile Glu Lys Ile Val
100 105 110

Lys Tyr Cys Gly Phe Asp Leu His Thr His Val Ile Asn Trp Glu Glu
115 120 125

Ile Arg Asp Leu Gln Leu Ala Tyr Met Lys Ala Ala Val Ala Asn Gln
130 135 140

Asp Val Pro Gln Asp His Ala Phe Phe Ala Ser Met Tyr His Phe Ala
145 150 155 160

Val Lys Asn Asn Ile Lys Tyr Ile Leu Ser Gly Gly Asn Leu Ala Thr
165 170 175

Glu Ala Val Phe Pro Asp Thr Trp His Gly Ser Ala Met Asp Ala Ile
180 185 190

Asn Leu Lys Ala Ile His Lys Lys Tyr Gly Glu Arg Pro Leu Arg Asp
195 200 205
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Tyr Lys Thr Ile Ser Phe Leu Glu Tyr Tyr Phe Trp Tyr Pro Phe Val
210 215 220

Lys Gly Met Arg Thr Val Arg Pro Leu Asn Phe Met Ala Tyr Asp Lys
225 230 235 240

Ala Lys Ala Glu Thr Phe Leu Gln Glu Thr Ile Gly Tyr Arg Ser Tyr
245 250 255

Ala Arg Lys His Gly Glu Ser Ile Phe Thr Lys Leu Phe Gln Asn Tyr
260 265 270

Tyr Leu Pro Thr Lys Phe Gly Tyr Asp Lys Arg Lys Leu His Tyr Ser
275 280 285

Ser Met Ile Leu Ser Gly Gln Met Thr Arg Asp Glu Ala Gln Ala Lys
290 295 300

Leu Ala Glu Pro Leu Tyr Asp Ala Asp Glu Leu Gln Phe Asp Ile Glu
305 310 315 320

Tyr Phe Cys Lys Lys Met Arg Ile Thr Gln Ala Gln Phe Glu Glu Leu
325 330 335

Met Asn Ala Pro Val His Asp Tyr Ser Glu Phe Ala Asn Trp Asp Ser
340 345 350

Arg Gln Arg Ile Ala Lys Lys Val Gln Met Ile Val Gln Arg Ala Leu
355 360 365

Gly Arg Arg Ile Asn Val Tyr Ser
370 375

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 373 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Pseudomonas aeruginosa 

(B) STRAIN: PA01 

(vii) IMMEDIATE SOURCE: 
(B) CLONE: psbH 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

Met Thr Lys Val Ala His Leu Thr Ser Val His Ser Arg Tyr Asp Ile
1 5 10 15

Arg Ile Phe Arg Lys Gln Cys Arg Thr Leu Ser Gln Tyr Gly Tyr Asp
20 25 30

Val Tyr Leu Val Val Ala Asp Gly Lys Gly Asp Glu Val Lys Asp Gly
35 40 45

Val Arg Ile Val Asp Val Gly Val Leu Ser Gly Arg Leu Asn Arg Ile
50 55 60
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Leu Lys Thr Thr Arg Lys Ile Tyr Glu Gln Ala Leu Ala Leu Gly Ala
65 70 75 80

Asp Val Tyr His Phe His Asp Pro Glu Leu Ile Pro Val Gly Leu Arg
85 90 95

Leu Lys Lys Gln Gly Lys Gln Val Ile Phe Asp Ser His Glu Asp Val
100 105 110

Pro Lys Gln Leu Leu Ser Lys Pro Tyr Met Arg Pro Phe Leu Arg Arg
115 120 125

Val Val Ala Val Leu Phe Ser Cys Tyr Glu Lys Tyr Ala Cys Pro Lys
130 135 140

Leu Asp Ala Val Leu Thr Ala Thr Pro His Ile Arg Glu Lys Phe Lys
145 150 155 160

Asn Ile Asn Gly Asn Val Leu Asp Ile Asn Asn Phe Pro Met Leu Gly
165 170 175

Glu Leu Asp Ala Met Val Pro Trp Ala Ser Lys Lys Thr Glu Val Cys
180 185 190

Tyr Val Gly Gly Ile Thr Ser Ile Arg Gly Val Arg Glu Val Val Lys
195 200 205

Ser Leu Glu Cys Leu Lys Ser Ser Ala Arg Leu Asn Leu Val Gly Lys
210 215 220

Phe Ser Glu Pro Glu Ile Glu Lys Glu Val Arg Ala Leu Lys Gly Trp
225 230 235 240

Asn Ser Val Asn Glu His Gly Gln Leu Asp Arg Glu Asp Val Arg Arg
245 250 255

Val Leu Gly Asp Ser Val Ala Gly Leu Val Thr Phe Leu Pro Met Pro
260 265 270

Asn His Val Asp Ala Gln Pro Asn Lys Met Phe Glu Tyr Met Ser Ser
275 280 285

Gly Ile Pro Val Ile Ala Ser Asn Phe Pro Leu Trp Arg Glu Ile Val
290 295 300

Glu Gly Ser Asn Cys Gly Ile Cys Val Asp Pro Leu Ser Pro Ala Ala
305 310 315 320

Ile Ala Glu Ala Ile Asp Tyr Leu Val Ser Asn Pro Cys Glu Ala Ala
325 330 335

Ala Leu Gly Arg Asn Gly Gln Arg Ala Val Asn Glu Arg Tyr Asn Trp
340 345 350

Asp Leu Glu Gly Arg Lys Leu Ala Arg Phe Tyr Ser Asp Leu Leu Ser
355 360 365

Lys Arg Asp Ser Ile
370

(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 362 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Pseudomonas aeruginosa 

(B) STRAIN: PA01 

(vii) IMMEDIATE SOURCE: 
(B) CLONE: psbI

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

Met Lys Ile Leu Thr Ile Ile Gly Ala Arg Pro Gln Phe Ile Lys Ala
1 5 10 15

Ser Val Val Ser Lys Ala Ile Ile Glu Gln Gln Thr Leu Ser Glu Ile
20 25 30

Ile Val His Thr Gly Gln His Phe Asp Ala Asn Met Ser Glu Ile Phe
35 40 45

Phe Glu Gln Leu Gly Ile Pro Lys Pro Asp Tyr Gln Leu Asp Ile His
50 55 60

Gly Gly Thr His Gly Gln Met Thr Gly Arg Met Leu Met Glu Ile Glu
65 70 75 80

Asp Val Ile Leu Lys Glu Lys Pro His Arg Val Leu Val Tyr Gly Asp
85 90 95

Thr Asn Ser Thr Leu Ala Gly Ala Leu Ala Ala Ser Lys Leu His Val
100 105 110

Pro Ile Ala His Ile Glu Ala Gly Leu Arg Ser Phe Asn Met Arg Met
115 120 125

Pro Glu Glu Ile Asn Arg Ile Leu Thr Asp Gln Val Ser Asp Ile Leu
130 135 140

Phe Cys Pro Thr Arg Val Ala Ile Asp Asn Leu Lys Asn Glu Gly Phe
145 150 155 160

Glu Arg Lys Ala Ala Lys Ile Val Asn Val Gly Asp Val Met Gln Asp
165 170 175

Ser Ala Leu Phe Phe Ala Gln Arg Ala Thr Ser Pro Ile Gly Leu Ala
180 185 190

Ser Gln Asp Gly Phe Ile Leu Ala Thr Leu His Arg Ala Glu Asn Thr
195 200 205

Asp Asp Pro Val Arg Leu Thr Ser Ile Val Glu Ala Leu Asn Glu Ile
210 215 220

Gln Ile Asn Val Ala Pro Val Val Leu Pro Leu His Pro Arg Thr Arg
225 230 235 240

Gly Val Ile Glu Arg Leu Gly Leu Lys Leu Glu Val Gln Val Ile Asp
245 250 255

Pro Val Gly Tyr Leu Glu Met Ile Trp Leu Leu Gln Arg Ser Gly Leu
260 265 270

Val Leu Thr Asp Ser Gly Gly Val Gln Lys Glu Ala Phe Phe Phe Gly
275 280 285

LyS 290 ^ ° ln Thr Glu Tr P v ^ Glu Leu Val 



«S 3 00 

Thr Cys Gly Ala Asn Val Leu Val Gly Ala Ala Arg Asp Met Ile Val
305 310 315 320

Glu Ser Ala Arg Thr Ser Leu Gly Lys Thr Ile Gln Asp Asp Gly Gln
325 330 335

Leu Tyr Gly Gly Gly Gln Ala Ser Leu Gly Leu Leu Asn Ile Leu Pro
340 345 350

Ser Cys Asp Ala Leu Arg Val Glu Phe Lys
355 360

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 413 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 
(vi) ORIGINAL SOURCE:

(A) ORGANISM: Pseudomonas aeruginosa

(B) STRAIN: PA01

(vii) IMMEDIATE SOURCE:
(B) CLONE: psbJ

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

Met Asn Val Trp Tyr Val His Pro Tyr Ala Gly Gly Pro Gly Val Gly
1 5 10 15

Arg Tyr Trp Arg Pro Tyr Tyr Phe Ser Lys Phe Trp Asn Gln Ala Gly
20 25 30

His Arg Ser Val Ile Ile Ser Ala Gly Tyr His His Leu Leu Glu Pro
35 40 45

Asp Glu Lys Arg Ser Gly Val Thr Cys Val Asn Gly Ala Glu Tyr Ala
50 55 60

Tyr Val Pro Thr Leu Arg Tyr Leu Gly Asn Gly Val Gly Arg Met Leu
65 70 75 80

Ser Met Leu Ile Phe Thr Met Met Leu Leu Pro Phe Cys Leu Ile Zu
85 90 95

Ala Leu Lys Arg Gly Thr Pro Asp Ala Ile Ile Tyr Ser Ser Pro His
100 105 110

Pro Phe Gly Val Val Ser Cys Trp Leu Ala Ala Arg Leu Leu Gly Ala
115 120 125

Lys Phe Val Phe Glu Val Arg Asp Ile Trp Pro Leu Ser Leu Val Glu
130 135 140

Leu Gly Gly Leu Lys Ala Asp Asn Pro Leu Val Arg Val Thr Gly Trp
145 150 155 160

Ile Glu Arg Phe Ser Tyr Ala Arg Ala Asp Lys Ile Ile Ser Leu Leu
165 170 175

Pro Cys Ala Glu Pro His Met Ala Asp Lys Gly Leu Pro Ala Gly Lys 
180 185 190 

Phe Leu Trp Val Pro Asn Gly Val Asp Ser Ser Asp Ile Ser Pro Asp 
195 200 205 

Ser Ala Val Ser Ser Ser Asp Leu Val Arg His Val Gln Val Leu Lys
210 215 220

Glu Gln Gly Val Phe Val Val Ile Tyr Ala Gly Ala His Gly Glu Pro
225 230 235 240

Asn Ala Leu Glu Gly Leu Val Arg Ser Ala Gly Leu Leu Arg Glu Arg
245 250 255

Gly Ala Ser Ile Arg Ile Ile Leu Val Gly Lys Gly Glu Cys Lys Glu
260 265 270

Gln Leu Lys Ala Ile Ala Ala Gln Asp Ala Ser Gly Leu Val Glu Phe
275 280 285

Phe Asp Gln Gln Pro Lys Glu Thr Ile Met Ala Val Leu Lys Leu Ala
290 295 300

Ser Ala Gly Tyr Ile Ser Leu Lys Ser Glu Pro Ile Phe Arg Phe Gly
305 310 315 320

Val Ser Pro Asn Lys Leu Trp Asp Tyr Met Leu Val Gly Leu Pro Val
325 330 335

Ile Phe Ala Cys Lys Ala Gly Asn Asp Pro Val Ser Asp Tyr Asp Cys
340 345 350

Gly Val Ser Ala Asp Pro Asp Ala Pro Glu Asp Ile Thr Ala Ala Ile
355 360 365

Phe Arg Leu Leu Leu Leu Ser Glu Asp Glu Arg Arg Thr Met Gly Gln
370 375 380

Arg Gly Arg Asp Ala Val Leu Glu His Tyr Thr Tyr Glu Ser Leu Ala
385 390 395 400

Leu Gln Val Leu Asn Ala Leu Ala Asp Gly Arg Ala Ala
405 410 

(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 320 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS : single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Pseudomonas aeruginosa 

(B) STRAIN: PA01 

(vii) IMMEDIATE SOURCE: 
(B) CLONE: psbK

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

Met Lys Ala Val Met Val Thr Gly Ala Ser Gly Phe Val Gly Ser Ala
1 5 10 15

Leu Cys Cys Glu Leu Ala Arg Thr Gly Tyr Ala Val Ile Ala Val Val
20 25 30

Arg Arg Val Val Glu Arg Ile Pro Ser Val Thr Tyr Ile Glu Ala Asp
35 40 45

Leu Thr Asp Pro Ala Thr Phe Ala Gly Glu Phe Pro Thr Val Asp Cys
50 55 60

Ile Ile His Leu Ala Gly Arg Ala His Ile Leu Thr Asp Lys Val Ala
65 70 75 80

Asp Pro Leu Ala Ala Phe Arg Glu Val Asn Arg Asp Ala Thr Val Arg
85 90 95

Leu Ala Thr Arg Ala Leu Glu Ala Gly Val Lys Arg Phe Val Phe Val
100 105 110

Ser Ser Ile Gly Val Asn Gly Asn Ser Thr Arg Gln Gln Ala Phe Asn
115 120 125

Glu Asp Ser Pro Ala Gly Pro His Ala Pro Tyr Ala Ile Ser Lys Tyr
130 135 140

Glu Ala Glu Gln Glu Leu Gly Thr Leu Leu Arg Gly Lys Gly Met Glu
145 150 155 160

Leu Val Val Val Arg Pro Pro Leu Ile Tyr Ala Asn Asp Ala Pro Gly
165 170 175

Asn Phe Gly Arg Leu Leu Lys Leu Val Ala Ser Gly Leu Pro Leu Pro
180 185 190

Leu Asp Gly Val Arg Asn Ala Arg Ser Leu Val Ser Arg Arg Asn Ile
195 200 205

Val Gly Phe Leu Ser Leu Cys Ala Glu His Pro Asp Ala Ala Gly Glu
210 215 220

Leu Phe Leu Val Ala Asp Gly Glu Asp Val Ser Ile Ala Gln Met Ile
225 230 235 240

Glu Ala Leu Ser Arg Gly Met Gly Arg Arg Pro Ala Leu Phe Thr Phe
245 250 255

Pro Ala Val Leu Leu Lys Leu Val Met Cys Leu Leu Gly Lys Ala Ser
260 265 270

Met His Glu Gln Leu Cys Gly Ser Leu Gln Val Asp Ala Ser Lys Ala
275 280 285

Arg Arg Leu Leu Gly Trp Val Pro Val Glu Thr Ile Gly Ala Gly Leu
290 295 300

Gln Ala Ala Gly Arg Glu Tyr Ile Leu Arg Gln Arg Glu Arg Arg Lys
305 310 315 320
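A peptide listing of this kind can be cross-checked against the corresponding nucleotide sequence by translating the coding region codon by codon. The following is a minimal Python sketch of such a check, assuming the standard genetic code and a short window transcribed by hand from the nucleotide figure; the function name `translate`, the choice of window, and the use of "Xaa" for unreadable codons are illustrative assumptions, not part of the specification.

```python
# Standard genetic code packed in TCAG order: index = 16*b1 + 4*b2 + b3.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def translate(dna: str) -> list[str]:
    """Translate DNA into three-letter residues, one per complete codon."""
    dna = "".join(dna.lower().split())  # drop the 10-base spacing of the figure
    residues = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        codon = dna[i:i + 3]
        if any(base not in BASES for base in codon):
            residues.append("Xaa")  # ambiguous or unreadable base: do not guess
            continue
        b1, b2, b3 = (BASES.index(base) for base in codon)
        residues.append(THREE[AMINO[16 * b1 + 4 * b2 + b3]])
    return residues


# Hypothetical window: the first 48 bases of an open reading frame as read
# from the nucleotide figure, and the first listed line of SEQ ID NO: 16.
fragment = "atgaaagctgtc atggtgaccg gtgcatcagg attcgtcgga tcggcc"
listed = "Met Lys Ala Val Met Val Thr Gly Ala Ser Gly Phe Val Gly Ser Ala".split()

print(translate(fragment) == listed)
```

Where the translated residues and the listed residues disagree, the disagreement only flags a position for inspection against the original document; it does not by itself show which of the two transcriptions is wrong.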

(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 665 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS : single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Pseudomonas aeruginosa 

(B) STRAIN: PA01 

(vii) IMMEDIATE SOURCE: 
(B) CLONE: psbM

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

Met Leu Asp Asn Leu Arg Ile Lys Leu Leu Gly Leu Pro Arg Arg Tyr
1 5 10 15

Lys Arg Met Leu Gln Val Ala Ala Asp Val Thr Leu Val Trp Leu Ser
20 25 30

Leu Trp Leu Ala Phe Leu Val Arg Leu Gly Thr Glu Asp Met Ile Ser
35 40 45

Pro Phe Ser Gly His Ala Trp Leu Phe Ile Ala Ala Pro Leu Val Ala
50 55 60

Ile Pro Leu Phe Ile Arg Phe Gly Met Tyr Arg Ala Val Met Arg Tyr
65 70 75 80

Leu Gly Asn Asp Ala Leu Ile Ala Ile Ala Lys Ala Val Thr Ile Ser
85 90 95

Ala Leu Val Leu Ser Leu Leu Val Tyr Trp Tyr Arg Ser Pro Pro Ala
100 105 110

Val Val Pro Arg Ser Leu Val Phe Asn Tyr Trp Trp Leu Ser Met Leu
115 120 125

Leu Ile Gly Gly Leu Arg Leu Ala Met Arg Gln Tyr Phe Met Gly Asp
130 135 140

Trp Tyr Ser Ala Val Gln Ser Val Pro Phe Leu Asn Arg Gln Asp Gly
145 150 155 160

Leu Pro Arg Val Ala Ile Tyr Gly Ala Gly Ala Ala Ala Asn Gln Leu
165 170 175

Val Ala Ala Leu Arg Leu Gly Arg Ala Met Arg Pro Val Ala Phe Ile
180 185 190

Asp Asp Asp Lys Gln Ile Ala Asn Arg Val Ile Ala Gly Leu Arg Val
195 200 205

Tyr Thr Ala Lys His Ile Arg Gln Met Ile Asp Glu Thr Gly Ala Gln
210 215 220

Glu Val Leu Leu Ala Ile Pro Ser Ala Thr Arg Ala Arg Arg Arg Glu
225 230 235 240

Ile Leu Glu Ser Leu Glu Pro Phe Pro Leu His Val Arg Ser Met Pro
245 250 255

Gly Phe Met Asp Leu Thr Ser Gly Arg Val Lys Val Asp Asp ^ Gln
260 265 270
Glu Val Asp Ile Ala ^ teu ^ Gly ^ ^ ^ a ^ ^ 

Lys Glu Leu Leu Glu Arg Cys Ile Arg Gly Gln Val Val Met Val Thr
290 295 300

«y Al. Oly G ly Ser II. G l y s.r «u to Cys ^ Gln „. ^ ^ 
cys Ser Pro s.r v.l Leu „. Le- Ph6 Hi5 s . r 01u ^ ^ ™ 

Tyr Ser H. His Gln Glu ^ Glu ^ Lys ^ 21 

345 350 
Ser Val Asn Leu Leu Pro Ile Leu Gly Ser Val Arg Asn Pro Glu Arg
355 360 365

Leu Val Asp Val Met Arg Thr Trp Lys Val Asn Thr Val Tyr His Ala
370 375 380

Ala Ala Tyr Lys His Val Pro Ile Val Glu His Asn Ile Ala Glu Gly
385 390 395 400

Val Leu Asn Asn Val n. Gly Thr Leu His Ala Val Gln Ala Ala Val
405 410 415

Gln Val Gly Val Gln Asn Phe Val Leu Ile Ser Thr Asp Lys Ala Val
420 425 430

Arg Pro Thr Asn Val Met Gly Ser Thr Lys Arg Leu Ala Glu Met Val
435 440 445

Leu Gln Ala Leu Ser Asn Glu Ser Ala Pro Leu Leu Phe Gly Asp Arg

Ug Asp Val His His Val Asn Lys Thr Arg Phe Thr Met Val Arg Phe 

475 480 
Oly As» v.! jjy Ser Ser Gly Ser ^ „. „ _ phe ^ ^ 

«» II. Lys Aro oly oly p„ val ^ V41 Thr H . s ^ ^ "I ^ 

Ar, Tyr p ;? „. t Thr „. Prt> gj ^ ^ ^ ^ ^ ^ ^ ^ 

oly S ; m« «, Gln Gly ^ VM phe V41 ^ ™ mm ^ 

Jro V.! , ye He Leu ^ M . olu Lys ^ ^ 

5:55 560 
Leu s.r val Ar„ Ser G lu Ar 9 Ser Pro His Oly Asp He Ala „. „„ 

Phe S.r Gly Leu Ar g Pro ol y Glu y Leu ^ ^ ^ 

b " 590 
Gly Asp Asn Val Asn Pro Thr Asp His Pro Met Ile Met Arg Ala Asn
595 600 605

Glu Glu His Leu Ser Trp Glu Ala Phe Lys Val Val Leu Glu Gln Leu
610 615 620

Leu Ala Ala Val Glu Lys Asp Asp Tyr Ser Arg Val Arg Gln Leu Leu
625 630 635 640

Arg Glu Thr Val Ser Gly Tyr Ala Pro Asp Gly Glu Ile Val Asp Trp
645 650 655

Ile Tyr Arg Gln Arg Arg Arg Glu Pro
660 665

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 463 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Pseudomonas aeruginosa 

(B) STRAIN: PA01 

(vii) IMMEDIATE SOURCE:
(B) CLONE: psbN

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

Met Ile Asn Ser His Leu Leu Tyr Arg Leu Ser Tyr Arg Gly Thr Ala
1 5 10 15

Arg Arg Met Leu Leu Ile Lys Lys Gly Lys Pro Leu Pro Met Thr Ser
20 25 30

Pro Phe Ser Leu Gln Asp Leu Asp Asp Gly Leu Gly Asp Gly Leu Gln
35 40 45

Val Arg Phe Val Gln Arg Gly Asp Ala Asp Thr Ala Gly Ala Asp Gly
50 55 60

Val Asp Thr Glu Leu Gly Leu Gln Ala Leu Asp Leu Val Gly Gly Gln
65 70 75 80

Ala Gly Ile Gly Glu His Ala Thr Leu Ala Thr Asp Glu Thr Glu Val
85 90 95

Ala Leu Gly Ala Val Gly Cys Gln Leu Leu Asp His Arg Gln Ala His
100 105 110

Val Ala Asp Ala Val Ala His Leu Ala Gln Phe Leu Leu Pro Glu Gly
115 120 125

Pro Gln Phe Arg Ala Val Glu His Gly Gly Asp Asp Ala Gly Ala Val
130 135 140

Gly Arg Trp Val Arg Ile Val Gly Ala Asp His Pro Leu His Leu Gly
145 150 155 160

Gln His Ala Gly Arg Phe Ile Ala Ala Phe Gly His Asp Arg Glu Gly
165 170 175

Ala Asp Ala Phe Ala Ile Glu Arg Glu Gly Phe Gly Glu Arg Ala Gly
180 185 190

Asn Glu Glu Ala Gln Ala Arg Leu Gly Glu Gln Ala His Arg Gly Gly
195 200 205

Val Phe Leu Asp Ala Val Ala Glu Ala Leu Val Gly Asp Val Glu Glu
210 215 220

Arg His Val Ala Leu Gly Leu Glu His Val Gln His Leu Phe Pro Val
225 230 235 240

Val Gln Leu Glu Ile Asp Ala Gly Arg Ile Met Ala Ala Gly Val Gil
245 250 255

Asn His Asp Arg Ala Gly Arg Gln Gly Ile Glr. Val Phe Gln Gln Ala
260 265 270

Gly Ala Val His Ala Ile Ala Gly Gly Val Val Ile Ala Val Val Leu
275 280 285

His Arg Glu Ala Gly Gly Phe Glu Gln Cys Ala Val Val Phe Pro Ala
290 295 300

Arg Val Ala Asp Gly His Gly Gly Val Gly Gln Gln Ala Leu Glu Glu
305 310 315 320

Val Gly Ala Glu Leu Glu Arg Ala Gly Ala Ala Asp Gly Leu Gly Arg
325 330 335

Asp His Thr Ala Gly Gly Gln Gln Leu Gly Leu Val Thr Glu Gln Gln
340 345 350

Phe Leu Tyr Ala Leu Val Val Gly Gly Asp Pro Phe Asp Arg Gln Val
355 360 365

Ala Ala Arg Arg Val Gly Leu Asp Ala Gly Leu Leu Gly Ser Leu His
370 375 380

Gly Thr Gln Gln Arg Asn Ala Pro Leu Leu Val Val Val His Ala His
385 390 395 400

Ala Gln Val Asp Leu Ala Arg Thr Gly Ile Gly Val Glu Gly Phe Val
405 410 415

Gln Ala Lys Asp Gly Ile Thr Arg Cys His Phe Asp Gly Arg Lys Gln
420 425 430

Thr His Phe Ala Ala Ala Arg Ser Val Lys Arg Gly Gly Gln Arg Asn
435 440 445

Pro Leu Cys Gly Gly Ala Lys Gly Cys Ala Asn Gly Gly Leu Leu
450 455 460
(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 238 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Pseudomonas aeruginosa

(vii) IMMEDIATE SOURCE:
(B) CLONE: uvrB

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

Met His Ala Ala Thr Phe Arg Cys Met Leu Ser Ala Ile Ser Asp Ala
1 5 10 15

Gly Phe Ser Leu Ala Ser Gln Leu Pro Ala Arg Phe Phe Met Asp Thr
20 25 30

Phe Gln Leu Asp Ser Arg Phe Lys Pro Ala Gly Asp Gln Pro Glu Ala
35 40 45

Ile Arg Gln Met Val Glu Gly Leu Glu Ala Gly Leu Ser His Gln Thr
50 55 60

Leu Leu Gly Val Thr Gly Ser Gly Lys Thr Phe Ser Ile Ala Asn Val
65 70 75 80

Ile Ala Gln Val Gln Arg Pro Thr Leu Val Leu Ala Pro Asn Lys Thr
85 90 95

Leu Ala Ala Gln Leu Tyr Gly Glu Phe Lys Thr Phe Phe Pro His Asn
100 105 110

Ser Val Glu Tyr Phe Val Ser Tyr Tyr Asp Tyr Tyr Gln Pro Glu Ala
115 120 125

Tyr Val Pro Ser Ser Asp Thr Tyr Ile Glu Lys Asp Ser Ser Ile Asn
130 135 140

Asp His Ile Glu Gln Met Arg Leu Ser Ala Thr Lys Ala Leu Leu Glu
145 150 155 160

Arg Pro Asp Ala Ile Ile Val Ala Thr Val Ser Ser Ile Tyr Gly Leu
165 170 175

Gly Asp Pro Ala Ser Tyr Leu Lys Met Val Leu His Leu Asp Arg Gly
180 185 190

Asp Arg Ile Asp Gln Arg Glu Leu Leu Arg Arg Leu Thr Ser Leu Gln
195 200 205

Tyr Thr Arg Asn Asp Met Asp Phe Ala Arg Ala Thr Phe Arg Val Arg
210 215 220

Gly Asp Val Ile Asp Ile Phe Pro Ala Glu Ser Asp Leu Glu
225 230 235 
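The declared LENGTH of each entry can be checked against its SEQUENCE DESCRIPTION by converting the three-letter listing to one-letter code and counting residues. The following is a minimal Python sketch of that check; the function name `parse_listing` and the two-line sample are illustrative assumptions. Tokens that are not one of the twenty standard three-letter codes raise an error instead of being guessed.

```python
ONE_LETTER = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def parse_listing(text: str) -> str:
    """Convert a three-letter listing to one-letter code.

    Purely numeric tokens are treated as residue-position markers
    (for example "1 5 10 15") and skipped.
    """
    out = []
    for token in text.split():
        if token.isdigit():
            continue
        if token not in ONE_LETTER:
            raise ValueError(f"unrecognised residue token: {token!r}")
        out.append(ONE_LETTER[token])
    return "".join(out)


# Sample: the first two lines of a listing, with their position markers.
sample = """
Met His Ala Ala Thr Phe Arg Cys Met Leu Ser Ala Ile Ser Asp Ala
1 5 10 15

Gly Phe Ser Leu Ala Ser Gln Leu Pro Ala Arg Phe Phe Met Asp Thr
20 25 30
"""

sequence = parse_listing(sample)
print(sequence, len(sequence))
```

Applied to a complete entry, `len(sequence)` can be compared with the value given under (A) LENGTH; a mismatch suggests a dropped or duplicated residue in the transcription.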

(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 303 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS : single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Pseudomonas aeruginosa 

(B) STRAIN: PA01

(vii) IMMEDIATE SOURCE:
(B) CLONE: psbL

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

Met Met Ile Trp Met Ile Ala Cys Leu Val Val Leu Leu Phe Ser Phe
1 5 10 15

Val Ala Thr Trp Gly Leu Arg Arg Tyr Ala Leu Ala Thr Lys Leu Met
20 25 30

Asp Val Pro Asn Ala Arg Ser Ser His Ser Gln Pro Thr Pro Arg Gly
35 40 45

Gly Gly Val Ala Ile Val Leu Val Phe Leu Ala Ala Leu Val Trp Met
50 55 60

Leu Ser Ala Gly Ser Ile Ser Gly Gly Trp Gly Gly Ala Met Leu Gly
65 70 75 80

Ala Gly Ser Gly Val Ala Leu Leu Gly Phe Leu Asp Asp His Gly His
85 90 95

Ile Ala Ala Arg Trp Arg Leu Leu Gly His Phe Ser Ala Ala Ile Trp
100 105 110

Ile Leu Leu Trp Thr Gly Gly Phe Pro Pro Leu Asp Val Val Gly His
115 120 125

Ala Val Asp Leu Gly Trp Leu Gly His Val Leu Ala Val Phe Tyr Leu
130 135 140

Val Trp Val Leu Asn Leu Tyr Asn Phe Met Asp Gly Ile Asp Gly Ile
145 150 155 160

Ala Ser Val Glu Ala Ile Gly Val Cys Val Gly Gly Ala Leu Ile Tyr
165 170 175

Trp Leu Thr Gly His Val Ala Met Val Gly Ile Pro Leu Leu Leu Ala
180 185 190

Cys Ala Val Ala Gly Phe Leu Ile Trp Asn Phe Pro Pro Ala Arg Ile
195 200 205

Phe Met Gly Asp Ala Gly Ser Gly Phe Leu Gly Met Val Ile Gly Ala
210 215 220

Leu Ala Ile Gln Ala Ala Trp Thr Ala Pro Ser Leu Phe Trp Cys Trp
225 230 235 240

Leu Ile Leu Leu Gly Val Phe Ile Val Asp Ala Thr Tyr Thr Leu Ile
245 250 255

Arg Arg Ile Ala Arg Gly Glu Lys Phe Tyr Glu Ala His Arg Ser His
260 265 270
Ala Tyr Gln Phe Ala Ser Arg Arg Tyr Ala Ser His Leu Arg Val Thr
275 280 285

Leu Gly Val Leu Ala Ile Asn Thr Leu Trp Leu Leu Arg Trp His

290 295 300

WE CLAIM:

1. An isolated P. aeruginosa B-band gene cluster containing the following genes: wzz,
wbpA, wbpB, wbpC, wbpD, wbpE, wzy, wbpF, wbpG, wbpH, wbpI, wbpJ, wbpK, wbpL, wbpM
and wbpN involved in the synthesis and assembly of lipopolysaccharide in P. aeruginosa.

2. An isolated P. aeruginosa B-band gene cluster as claimed in claim 1 wherein the
genes are organized as shown in Figure 1 (SEQ.ID.NO:1).

3. An isolated nucleic acid molecule encoding: 

(1) (a) Wzz; (b) WbpA; (c) WbpB; (d) WbpC; (e) WbpD; (f) WbpE; (g) Wzy; (h)
WbpF; (i) WbpG; (j) WbpI; (k) WbpJ; (l) WbpK; (m) WbpM; (n) WbpH; and (o) WbpN
involved in P. aeruginosa O-antigen synthesis and assembly;

(2) UvrB involved in ultraviolet repair; 

(3) HisH or HisF involved in histidine synthesis; 

(4) RpsA, a 30S ribosomal subunit protein S1.

4. A nucleic acid molecule comprising nucleic acid sequences encoding two or more of the
following proteins (1) (a) Wzz; (b) WbpA; (c) WbpB; (d) WbpC; (e) WbpD; (f) WbpE; (g)
Wzy; (h) WbpF; (i) HisH; (j) HisF; (k) WbpG; (l) WbpI; (m) WbpJ; (n) WbpK; (o) WbpM;
(p) WbpN; (q) WbpH; (r) WbpL; and (s) RpsA.

5. A recombinant molecule adapted for transformation of a host cell comprising a
nucleic acid molecule as claimed in claim 3 and an expression control sequence operatively
linked to the DNA segment.

6. A transformant host cell including a recombinant molecule as claimed in claim 5.

7. An isolated protein characterized in that it has part or all of the primary
structural conformation of a protein encoded by a gene of the psb gene cluster as claimed in
claim 1.

8. A purified protein having the amino acid sequence as shown in Figure 3 or SEQ ID
NO:2; Figure 4 or SEQ ID NO:3; Figure 5 or SEQ ID NO:4; Figure 6 or SEQ ID NO:5; Figure 7
or SEQ ID NO:6; Figure 8 or SEQ ID NO:7; Figure 9 or SEQ ID NO:8; Figure 10 or SEQ ID
NO:9; Figure 11 or SEQ ID NO:10; Figure 12 or SEQ ID NO:11; Figure 13 or SEQ ID NO:12;
Figure 14 or SEQ ID NO:13; Figure 15 or SEQ ID NO:14; Figure 16 or SEQ ID NO:15; Figure
17 or SEQ ID NO:16; Figure 18 or SEQ ID NO:17; Figure 19 or SEQ ID NO:18; or Figure
20 or SEQ ID NO:19.

9. A monoclonal or polyclonal antibody specific for an epitope of a purified protein as
claimed in claim 8. 

10. A method for detecting P. aeruginosa in a sample comprising contacting the sample
with a monoclonal or polyclonal antibody as claimed in claim 9 which is capable of being
detected after it becomes bound to protein in the sample.

11. A method for detecting the presence of a nucleic acid molecule as claimed in claim 3
in a sample, comprising contacting the sample with a nucleotide probe capable of
hybridizing with the nucleic acid molecule, to form a hybridization product, under conditions
which permit the formation of the hybridization product, and assaying for the
hybridization product.

12. A method for detecting the presence of a nucleic acid molecule as claimed in claim 3
or a predetermined oligonucleotide fragment thereof in a sample, comprising treating the
sample with primers which are capable of amplifying the nucleic acid molecule or the
predetermined oligonucleotide fragment [illegible]
amplified sequences under conditions which permit the formation of amplified sequences
and assaying for amplified sequences.

13. A kit for detecting [illegible] by [illegible] O-antigen
synthesis or assembly in a sample [illegible]
reagents required for binding of the antibody to [illegible] in the sample,
and directions for its use.



14. A kit for detecting the presence of a nucleic acid molecule as claimed in claim 3 in a 
sample comprising a nucleotide probe capable of hybridizing with the nucleic acid 
molecule, reagents required for hybridization of the nucleotide probe with the nucleic acid 
molecule, and directions for its use. 
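The amplification-based detection described in the claims above depends on a primer pair that anneals to opposite strands of the target. Whether a candidate pair has exact binding sites on a known template can be checked computationally before any laboratory work. The following is a minimal Python sketch of such a check; the function names `revcomp` and `amplicon` and both primer sequences are hypothetical and chosen only for illustration, the template is a single 60-base line transcribed from the nucleotide figure, and only exact matches are considered, which ignores mismatch tolerance and annealing conditions.

```python
COMPLEMENT = str.maketrans("acgt", "tgca")


def revcomp(seq: str) -> str:
    """Return the reverse complement of a DNA string (lower case)."""
    return seq.lower().translate(COMPLEMENT)[::-1]


def amplicon(template: str, forward: str, reverse: str):
    """Return the predicted product, or None if a primer has no exact site.

    The forward primer must match the given strand; the reverse primer
    must match the opposite strand downstream of the forward primer.
    """
    template = "".join(template.lower().split())
    start = template.find(forward.lower())
    if start < 0:
        return None
    site = revcomp(reverse)  # reverse-primer site as read on the given strand
    end = template.find(site, start + len(forward))
    if end < 0:
        return None
    return template[start:end + len(site)]


# Template: one line of the nucleotide figure, spacing kept as printed.
template = "gagtctggct cttcaggtgt tgaacgccct cgctgatggg cgcgcagcat gaaagctgtc"
forward = "cttcaggtgttgaacg"  # hypothetical forward primer
reverse = "cagctttcatgctgcg"  # hypothetical reverse primer

product = amplicon(template, forward, reverse)
print(len(product))
```

A `None` result means only that no exact site was found on the supplied template; it says nothing about related sequences in other strains.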
FIGURE 7

BASE COUNT 4990 a 5938 c 7166 g 6323 t
ORIGIN

v S~ £~i -™ ~ :r ™ 

121 ctccactgat cagtgggcaa tcctctgagg agctc-caoc "ggaggagc 

181 tgtatatgcg tggcagtaag gcgattatgg ccgagattc^ llclttaatl 

241 ctgatgatcc ttttattccg gcgctgcgta ctcttca2=» gacattgga 9 gcgcgta 9 ct 

301 gcttgcgtgt taattcggag cgggtttctg tttttco!!a 9CagCagtta "gccgagta 

361 cggactcacc agttcgtcc! aggagagcg! tgattttgat "f^"^ 9 "agaaacgc 

421 gtgtgcttgg tggttttctg gcgttgKcc gLt ""gggctg ataattggtg 

461 aaagagctag ttactgaag? Igtgacgcgt ^ « ^ ItT-*^** 9Ctcg " a ^ 

541 agtaggtttt cgtcgggtgg ctcgat^acr tlltl t "ggtcgagt aattttgtgg 

601 cctcagctct gtctcctg^a c , Saggggtgag aacgtctcca cgcggtgtct 

661 gttgtjggt. t^ES" "* a S«gt:g 

721 agatgccaag ttgcctggca g^tcgttaca tc-ca?! ct ^^tcgg gcgtgcgaga 

781 gcggggtgat ctacagfLt ftgcttaat! ! gtattcgaag "cggcaatc 

841 tcttcgggtg tcaactgoat e«!2«! ^cgcaggc ttggtcaggg tcgagtcggg 

901 t gac t IJcaI tccagggcgg lllTclaatt "f 9 "^ W««« B .t aagctcggct 

961 gtgagctggj caggctga^g gc^g'tcga Illicit aggagcctgc 

1021 atgggcggac cagaatoeco trrrH aa 9cgagtca gcattgtggt ccggaagggc 

1081 cgcSatgat tatglcjccc crcattttta a™?"** "iteecfg 

1141 cagtatccct tccttaSoI lltt I 9 c «ttgcc g ccaggtgctg tggaaagcga 

1201 tcgactgacg latgaa™ ' Cttgt9aag a ^tcgagag tggtcgcaga aaggactcL 

1261 agg^aaa^^ gagg^ J™,* lll^l 

1441 caagcttaat Iccgggcagt gclttltlll a^tatT" atCgatgacg tcaaggttga 
1501 ccgtgcaagc ogtttcoaaa ItlnZtl acata "ccg caagccaaaa ttgctaaggc 
1561 g a g c?tttgt g^ccg^g 9 c^tES." 'tatca" 9 ' ^.tgccct 
1621 caataccacc gacgcactaa Si!" ^tatcgcgag ccggatatga gctttgtcat 
1681 taccacctat ccgggaacta tlalll gCgcgtagg 9 caggtggttt cgctggaaag 

1741 cg t gg ttgg : cllllllttt HSScS ItlllT" -TOtgccct 
1801 gaacttcgag actcgtacca l-tllll ! ttctcc ^ a 9 cgcgaagatc cgggcaaccc 

1861 agtcggcftt gccctgtato ll,! 9 9aCCggtggt cacactcctc agtgtctgga 

1921 g^ccgccgag fcgaccaagc tatted" CgaCC " 9tC ^-ggeca gttcc.cS. 

1981 caacgaaat^ aaga^cgttg ctgategcat llltT^ 9Cgg " aata ^cggtctggt 

2041 tgcggcgacc aagccgttcj gt-tcactcc ^ at " = tgaag tggttgatgc 

2101 ctgtatcccg atcgatccct tctacctoac " aCtaCCCa g 99«gggac tgggegggea 

2161 ccgcttcatc gaactgtctg ttggaaggct cgcgaatacg gaetgeatae 

2221 actcatggat ggcctgaacS faa* 9 ?^ CCaggccatg <=cggaa t acg tactgggcaa 

2281 gggtatcget tataagaaga atjtcgfcaa f!""" 39 ggCagCcg ' g t.ctggt.tt 

2341 ggagctgatc gaagccaajg ^ l!" ' cat 9C9cgag tcgccatccg tggaaatcat 

2401 cccgaagatg cgtjaacac? Icttcglact C9CCt3tagC »«ecgc«tg tgccggtgtt 

2461 ggcta gg ttc ga^ctg^ tg^t^g 9 ^ ^a 9 ^ IIT^TJ 

2521 caaggecgaa gecaaoctao m-^.^J y gac aagtttgacr a-gagctgat 

2581 catStcaag Icccg!".? cc^cccacc llll^T* taCCgCtC « ^eggcaca 

2641 cggatccgct catt^cata IZltlTcll ^1 tt -J^cggg 

270. gctacatcgc tcctcgccat atgcocgcca tcaaaLr,, c 9 c "tcatc ggtgctgccg 

2761 cccatgacat caatgactcg gteggtatta !! eggtaactge ctggtttcgg 

2821 ttaccgagtt cgagttcttt cttXtca^ tt9atagcat "ctccccag agcgagtttt 

2881 cgctggacta cg 9 = 9 tcgat C : - ga g - g cj.QC.jccc caagege tctgctaccg 

2941 gtctgegett gggctgegae otaatrrnm ^acccgcat ategctgeag 

jooi tctc^t. ^jtc nsss ::::n:ic" :^:! a c ::;: 

3061 tgcgccatca ccaggcgatc atcacatt-^ cccc.acaac attctgeaac 

3 !21 a t » gcacg . g9Ccg .? ctg -lllltll ~»J -.e^.. „_« n
9301 gactctgaag ataatgggag ctcccatttc cttgttggct gcttcggtgc tcgatgtgtt
9361 caaagaacaa gccgctcgtg actaccgaga gtttggtaat tgccgaggca tcttcctcaa
9421 gactttcagg ttgcttgccg tcctcgcgct acctcctttt atxatatttg gttcattggc 
9481 gagtgggcct ttgggttagt ctttggcgaa gcgtgggctg agtcggggcg ttatgctgta 
9541 ttgatggttc cgttgtttta tatgcgtttc gtggtgagtc cgctcagcta tacaatctat 
9601 attgcccagc ggcagagtat ggatttgttg tggcagctag ccttgttgct cctgacgttt 
9661 atctgtttta ccttgcctga ctctgtcgac tcggtgttgt ggttttactc catagcatat 
9721 gctgttatgt attttgtcta tttctggatg tccttccagc gtgccaaggg agatgccaag 
9781 cgatcgttgr tattgattac ggtgtaggta acattgcttc agtcttgaac atgctgaagc 
9841 gagttggtgc caaagccaag gcatccgata gccgagagga tatcgagcag gcggagaaac 
9901 tgattttgcc tggtgtcggt gcttttgacg ccggaatgca aacactacgc aagagtgggc 
9961 tggtggatgt actgacagag caggtcatga tcaaacgaaa gccggtcatg ggggtgtgtc 
10021 tcgggagtca gatgctgggg ctgcgatctg aggagggagc ggaaccgggg cttggatgga 
10081 tcgatatgga tagcgtccgt ttcgaaaggc gtgacgaccg aaaggttcca catatgggct 
10141 ggaatcaagt gtccccgcaa ttggagcatc ctatacttag cggtataaac gagcaaagcc 
10201 gattctattt tgttcatagt tattatatgg ttccgaaaga cccagacgat atcctgttga 
10261 gttgtaatta tggacaaaaa ttcactgcgg cggtggctcg ggataatgtt ttcggatttc 
10321 agtttcatcc tgagaagagt cataaattcg gtatgcagtt attcaaaaac ttcgtggagc 
10381 ttgtctgatg gtccggaggc gcgttatccc atgcctgctg ctcaaggatc gcggtctagt
10441 gaaaaccgtg aagttcaagg agcccaagta cgttggagac ccgatcaacg caatacgcat 
10501 cttcaatgag aaagaagtcg acgaactgat tttgctggat atagatgctt ccaggctcaa 
10561 ncaagagcct aaccatgagt tgatcgcgga agtggctggt gagtgcttta tgcctatttg 
10621 ctatgggggc ggtatcaaga cattggagca tgcggaaaaa atcttttccc taggtgtcga 
10681 aaaagtttcg ataaataccg ccgctcrtat ggatctttcg ttgattcgaa gaattgccga 
10741 taagtttggt tcgcaaagcg tagttggctc tatcgactgc cgcaagggtt tctggggagg 
10801 acactccgtg ttctcagaga atgggacgcg cgacatgaaa cgctccccat tggagtgggc 
10861 gcaagcgctc gaagaggctg gagtgggtga gatttttcta aattctattg atcgagatgg 
10921 agtgcagaaa ggcttcgaca acgctctagt ggaaaatatc gcttctaacg tccatgtgcc 
10981 agtgatcgcc tgtggtggag ctggctccat cgctgacctc atcgatcttt ttgagcgtac 
11041 gtgtgcgtcg gcagtagcgg cgggaagcct attcgttttc catggcaagc arcgtgcggt 
11101 actgattagt tatccggatg tcaacaagct cgacgtcggt tagagtgagc tgagttattt 
11161 atggcaagga cgcttgttgg caacgctata tgcgcttcaa gattgtcgaa ctaaatttga 
11221 gtttgtcagt ggggcgttcc attaggcagg ccgaggtgag tgcttcggga ggttgttgtg 
11281 atgaagatct gttcgcgctg tgttatggat acatctgacg ctgaaatcgt atttgatgag 
11341 gcgggagtct gtaatcactg ccacaaattt gacaatgttc agccccggca gctgttttcc 
11401 gatgctagtg gtgagcagcg ccttcaaaag ataattgggc agatcaagaa ggacggttca 
11461 ggtaaggatt atgactgcat cattggcctt agtggcggcg tagatagttc ctatcttgct 
11521 gtaaaggtca aggatcttgg cttgcgccca ctggttgtgc atgtggacgc cggctggaat 
11581 agcgaacttg cagtcagtaa tattgaaaag attgtaaaat attgcggttt tgatttacat 
11641 actcatgtaa taaactggga ggaaattcgt gatcttcagt tggcttatat gaaagctgct 
11701 gtcgccaatc aggatgtgcc tcaagatcat gccttcttcg ctagtatgta tcacttitgct 
11761 gtgaagaata atattaagta cattctgagt ggtggtaatt tggccactga ggcagtattc 
11821 ccagatacat ggcacggcag cgctatggat gcaataaacc taaaggctat tcacaaaaaa 
11881 tatggtgagc gtccgctaag ggactacaag actattagtc ttcttgagta ctatttctgg 
11941 tatccctttg tcaaaggaat gagaacggtc cgtccgttga atttcatggc ctatgataag 
12001 gccaaggctg aaaccttcct tcaagaaacg ataggctatc gttcrtacgc gcgaaagcat 
12061 ggagagtcga ttttcaccaa gcttttccag aactactatc taccgaccaa gtttggctat 
12121 gataaacgca aactgcacta ctccagcatg atrttgtccg ggcaaatgac gcgngacgaa
12181 gctcaggcta aactggctga gccgctatat gatgcagacg aactgcagtt tgatatcgaa 
12241 tatttctgca agaagatgcg aatcacccag gctcaarttg aagagttgat gaatgcacct 
12301 gttcatgact atncggagtt tgccaactgg gattctcgac agaggattgc gaaaaaagtt 
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ss as ssss =2= s=5 ™" 

i26 61 ggggctgatg cctatcattt tcatgatcc % c 2::^: : 

"m f 9C39gtCat CttC9actcc catga gg a tg tgccgalgL a"c5£££ 

s == ill 111 ss = === 

13021 tccattcgtg gtattcStga aotca^ aCt93agtCt ^"acgtcgg tggcatcact 

iss =22 £- = =s sssz 
ssi 22nh .4sR -™ 22s- 

S£ 2222 =S T *™ = ™- 

13681 aaatattttt 2 ggtattccaa "f^" 9 " """"^ "t.tgtctg 

13741 gtactcacgg ccaaatoHf lilt Wcggatta ccagttggat atccatggtg 

13801 agaaacctca Sc« a ^ taatggagat cgaggatgta attctcaagg 

13861 C g c ?e a i: C99C9 3taCCaaCtc taccttggct: ggagcgtcjg 

13921 tgcggacgcc Igajgaaa^ lilt 1*°™ aCatCgaagc cggcctgcga agtttcaata 
13981 gccctactcg £«a~I~ aa * C9tattC c ' acc 9 at « 99t t ag C gat attctgtttt 
14041 aga^agtcal Jgtgggtgat SSE"" a9a3t9aagg '"cg-ag. aaggctgcga 
14101 cctcgccaat tggactto™ f^ 9 " 9 " 59 ^^agcgctct attctttgcg cagcgtgcaa 
14161 agaacaccga cgatccag't 3? W " tattct ^cg.ccctg caccgtgccg 
14221 ttaatgctgc a L «™ ^ c «gactt cgatagtcga ggccccgaat gaaatccaga 
14281 t aggg^ c t9 : gc^a"^ cag^ttaMg ItlttT^ "cgagcgcc 
14341 tgttgcaacg ctctoccL! 'Ill I atcctgtcgg atatctggaa atgacctggc 

14401 t «tc gg a g " 9 aca9Cg ^ ^tcagaaa gaagcattct 

14461 gtggagccaa cottctt«S a = Cat 9 c * c 9 accagaccga atgggcggag ctagtgacct 
14521 gcSgg^" gaccattcaa JSST 9CgaCat9at ^^cgaarct gcacggacta 
14581 gattgctgaa £tc«««I gac * at ^tc agctttacgg aggcggtcaa gcctctctcg 
14641 atttagtKc atgaacotct ^T*"* "ttgcgcgt cgagtttaaa caaaggattt 
14701 ttatt^gcgg ?ct" 9t9Ca tCC « 3t9Ct ^cggccccg gagttggtcg 

14761 aatct ™ !" CCaa9tt "W'cag gctgggcatc gg tcggtcat 

iss 222: S222 HHS ™" =ss= «~ 
iss s~ ~? = =ss 222: 

1S061 ctggcctctg agtctaot™ III? " tgcgaaattc 9tat:tgagg tgcgcgacat 

SiS 2222 2222 °™ »~ 

iss 22X ~ ™ -™ *™ s=s= 
»». ^-l. 22.12 -™ 2:22: 2222 2222
15421 tgcaagtatc agaatcattc tggtgggcaa gggagagtgc aaagagcaac tcaaggcgat
15481 tgccgcacag gatgccagcg ggctagtgga gtttttcgat cagcagccca aagagaccat 
15541 catggctgtc ctgaagctgg cgtcggcggg ctacatctcg ctcaagtcag aaccgatctt 
15601 ccgctttggc gtgagcccca acaagctatg ggattacatg ctggttgggt tgccagtcat 
15661 tttcgcctgc aaggcaggga acgacccggt tagtgactac gattgcggtg tatctgccga 
15721 cccagatgcc cctgaggata ttactgcagc catcttccgt ctgttgctgc tgagcgaaga 
15781 cgagcgccgc acaatggggc aaagagggcg tgatgcggtc ctggagcatt atacctacga 
15841 gagtctggct cttcaggtgt tgaacgccct cgctgatggg cgcgcagcat gaaagctgtc 
15901 atggtgaccg gtgcatcagg attcgtcgga tcggccttgt gctgtgagct tgctcggaca 
15961 gggtatgcgg tgattgcggt ggtacggcgg gttgttgaaa gaataccttc tgtgacgtac 
16021 atcgaagctg atctgaccga tccagccacg tttgccggcg agttcccgac ggtggattgc 
16081 attattcatc tcgctggacg tgcccatata ctcactgaca aggttgcaga cccgcccgcc 
16141 gcatttcgtg aagtcaaccg agatgcgact gtccggttgg ctacccgtgc gctcgaggct 
16201 ggggtgaagc gtttcgtgtt tgtcagttca attggcgtta acggtaacag cacccggcaa 
16261 caggctttca acgaagattc tccagccggc ccacatgcgc cctatgccat ctccaaatac 
16321 gaggctgagc aggagctggg gactttgctc cggggtaaag gtatggagtt ggtggttgtc 
16381 cgaccgcctt tgatctatgc caatgatgcg ccaggtaact tcggccgttt gctcaagctc
16441 gtcgctagtg gtctgccgct tccgcttgac ggtgtccgta atgcgcgcag cctggtttct 
16501 aggagaaaca tcgtgggttt cctgagtctt tgtgccgaac accccgatgc tgcgggcgaa 
16561 ctgtttctgg tggcggatgg cgaggatgtt tccattgcgc aaatgatcga ggccctgagt 
16621 cggggaatgg gcaggcgtcc agctcctttc acgtttccag cggtgctgct gaagcttgta 
16681 atgtgcttgc tgggtaaggc ttccatgcat gaacagctct gtggctcgtt acaggtcgat 
16741 gcttccaagg cccgccggct gctcggctgg gttcccgtcg agactartgg cgccggtctg 
16801 caagcagcag gtcgagagta cattcttcgc cagagggagc gccgaaaatg acggacacat 
16861 ccaaacccct ggtcggcaat tacgctgaac tttaataagt tctctttcca atgatgatct
16921 ggatgatcgc gtgtctagtt gtcttgctgt tttcatttgt cgctacctgg gggctgcgtc 
16981 gctatgcatt agcgacgaaa ctgatggatg ttccgaatgc ccgtagctcc cacagtcaac 
17041 cgacgcctag ggggggaggt gttgcaatcg ttctggtctt ccttgcagcg ttggtgtgga 
17101 tgctgagtgc aggcagtatc tccggcggct gggggggggc gatgctgggt gcaggttctg 
17161 gcgtggcact gttagggttc ctggatgacc atgggcacat tgctgcgcgt tggcggctgc 
17221 tcggccattt ctcagcagcg atatggatct tgctgtggac gggtggtttc ccgccgctgg 
17281 atgtggttgg gcatgctgtc gacttaggat ggctgggcca cgtattggca gttttctatt 
17341 tggtatgggt gctgaacctt tataacttca tggatggcat tgatggtatt gccagtgtcg 
17401 aggccattgg cgtctgtgta ggaggggccc tgatctactg gcttacaggg catgtcgcga 
17461 tggttggtat ccctctgttg ctggcgtgcg cggccgccgg cttcctgatc tggaacttcc 
17521 ctccagctcg aatcttcatg ggtgatgcgg ggagtggttt tcttggtatg gttattggtg 
17581 cactagctat tcaggctgca tggaccgccc cctcgctgtt ctggtgctgg ttgatatrgc 
17641 tgggagtgtt catcgttgat gcaacctata ctctgatccg ccggatcgcc agaggggaga 
17701 aattctatga ggcgcatcgc agccacgctt atcagtttgc ctcgcgccgt tatgctagcc 
17761 atctgcgggt taccttgggt gttctggcta tcaacactct ttggttgttg cgttggcact 
17821 gatggttgca ttgggttgga tcagcggctt catcggtatc ctggttgctt atgctcctct 
17881 ttgcctcttg gcggtaggat acaaggcggg ttccttggaa aaatcctaag ccgtggatto 
17941 acctgctccc cgatttcagt accacgccga acttagtaga gtctgttttc cgagcaggag 
18001 acggcagtga aaaagcgttt tactgaagaa cagattctag actttctgaa gcaggcagaa 
18061 gccggtgtgc cggtgaagga gctgtgtcgc cgacacagct tcagtgatgc cacgttctac 
18121 acctagcggg ccaagttcgt cggcatgacc gtgccggatg ccaagcgcct gaaggatctc 
18181 gaactggaaa acagccggct gaagaagttg ctcgccgagt ccctcctcga catcggggcg 
18241 ccgaaagtgg tcacccgggg aaagggggag cccggcagcg gggcgggggg gcaggagatt 
18301 caggcgcaaa ccgacatctc cgagcgtcgt gccctgtcag ttgttcaggc tgtcccgctc 
" tgCgtt9tgc "ccagccgc gaactagtgt gcaaaacacc gagctgcaag cccaactggt 
18421 ggaactggca agggcttcgg cactttggct atcaccgcct gcacattctg ctgcogcgtg 
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18481 ctggtgtgca gatcaactac aagcggactt acr flfl rf^ a ^ 

18541 tgaagcggcg gaggcgccgc cacaggggcg cgg^gcgto cg 9 *?"^ " C " 9at " 

I860! gcgcaccgaa ccaggtcttg tcgatjjat? tcgt^ttcof lit T 9 a9c « 9CC9a 

18781 gcSScgca cccccg^tt f"' 9 ""* "^atggcac ggctgc^g 

as =25 sri ~ FF = - 

i»« ™« F- ~- =e ~" 

19201 ctggga"tt ccc'acccat gt^r"' 9Ct9t " C9a tccgtgcatg tggaactgat 

iSi sssss S F = °™ 

15501 °=g£g«cc "Wtwet M^tHt. 

s«i ssss ~ Ft? °™< ™*" esse 
iss ~ H S F ~ ~« 

19801 ggcacajaag acatfatcag ™ ! ' CC «f" c t«c«eett ggtcaggttg 
19861 tcggcggcca ttcccctgtt catccLH ^atgcct ggctgttcat cgccgccccg 
19921 ggcaacgacg cccttatcgc «tc«£ W«^«cc gggcggtgat gcgctacctg 
19981 ttgctggtct actigtlcco ItltZ 9 9CC9tCacca "tccgcgct ggtcctgtcg 
20041 tactgg^ggc tg^"? ^t^T* 9C " C " t9C <^«ccct ggtgttcaac 
20101 a t ggl 9 g 99 t £S"£ SS^Sg 

20161 cccagggtco cta r rfflr l 9 gCC9 ^taccatttc tcaaccgcca ggatggcctg 
20221 c tCg S S S ££££ ™Z? ^ 
202 81 gtcatcgccg gtctgcgggt cta?l™ atCgat9atg a = aa 9«gat cgccaaccgg 
20341 ggcgcgLgJ Jgg-tc-cct aa 9"taccc Sccagatgar cgacgagacg 

20401 ctcgajtccc tggagccltt ^ ' tccgccactc gggcccggcg ccgagagatt 

20461 accagcggcc gggtcaajgt g^cga"!^ 9t9C9CagCa ^cccggctt catggacctg 
20521 cgcgacagcg Sac^ «»»cgacctg caggaggtgg acatcgctga cctgctgggg 

20581 atgltgacS gggcgggcg^ l^lllTJ t^Tl 

20641 tcgcctagcg tgctgatcct atLt! tcggaactct gtcggcagat catgagttgt 
20701 gaactggag? «c«itSI ?" C9a9CaC ^Wc* acctctacag catccatcag 
20761 Lggcl?: 9 : Lcccgagcg ££52 SSSSS acc.g.cgcc ^cctcgg 9 
20821 taccatgcgg cggcctacaa atlrlll S^gatgcgta cctggaaggt caataccgtc 
20881 ctcaacaa?g cttgca?^ 9 ^f^ 3 ^ aCaacat ^ cgagggcgtt 

20941 aacttcgtgc tgatt^cac 2-?! 9C 9 ca 9gccg cggtgcaggt cggcgtgcag 

21001 aagcgc?tlg cfgaga^ggt 9C9C9aCCga ccaatgtgat gggcagcacc 

21061 ggcg! tC gga SSSeJS tcac«caal " Ca9CaaC9 "tcwc.cc g.tgctgttc 
21121 aacgtcctcg gttcgtccgg tc « tcacaacggt ccgcctcggc 

2H81 ggcccgg tg 9 ^llVcll c^gc"" ^f 9 ! 9 " 9 " "agcgcggc 

21241 gcgcagttgg tcatccaggc cggttcgato l ca l c tc "gaccat tcccgaggca 
21301 atgggg CCgc C gg t gaaga t cll^ltl SSSS JS!^ 

JS5 c% 9 t %™ 322 S a 9 "™ S tC9 ™ = " 

21481 ccga t g atC a cgcgggccaa ^llll ™~ a 
21541 gagcagttgc tggccgccgt ggagaaggac gactactcgc gggttcgcca gttgctgcgg 
21601 gaaaccgtca gcggctatgc gcctgacggt gaaatcgtcg actggatcta tcgccagagg 
21661 cggcgagaac cctgagtcat cgttctccgg aaaaggccgc ctagcggcct tttttgtttt 
21721 ctccgtacga tgtttccggt gccggaccag gaagcgactg ctttgctggg gctgtcgatc 
21781 caggtgcgtt ccacggcgat aaggtggttc cgtggatggg catgaagccc tctacgtggt 
21841 cattcatctc tgaaggagtg cacccatgca cctaatcaaa tccgctctgc ttctcatcct 
21901 gctcgcctgt cttccgtttt cggcttccgc cgcaccggtc gccgtcgcca agaatccgct 
21961 ggccgcaacg acacctgcga cgaccgtgtc gccgggggag caggtcaata tcaatacggt 
22021 cgacgaggcc gccctgatac gggggctcaa cggtgtcggc gaggccaagg ccagggcgat 
22081 cctcgagtat cgtgcggccc atggtccgtt cgtctcggtg gatcaactgc tggaagtgaa 
22141 aggggtaggc ccggcgttgc tggagaagaa ccgggcgcgg atcgccatcg agtgaggtgc 
22201 gactgaaggg gcgaactttc gtcccgataa cgaaaaagcc cccggcatgt gccgagggct 
22261 ttgaatttgg ctccgcgacc tggactcgaa ccagggaccc aatgattaac agtcatttgc 
22321 tctaccgact gagctatcgc ggaacagcga ggcgtatgtt actgattaaa aaggggaagc 
22381 ctctcccgat gacrtcccca ttttccctac aggacctgga cgatggcctt ggtgatggtc 
22441 tccaggttcg atttgttcag cgcggcgacg cagatacggc cggtgctgac ggcgtagata 
22501 ccgaactcgg tcttcaggcg ctcgacctgg tcggcggtca ggccggaata ggagaacatg 
22561 ccacgttggc gaccgacgaa actgaagtcg cgcttggcgc cgtgggctgc cagttgctcg 
22621 accatcgcca ggcgcatgtc gcggatgcgg tcgcgcatct cgcccagttc ctgctcccag 
22681 agggcccgca gttccgggct gttgagcacg gaggagacga cgctggcgcc gtgggtcggt 
22741 gggttcgaat agttggtgcg gatcacccgc ttcacctggg acagcacgcg ggccgattca 

22801 tcgcggcttt cggtcacgat cgagagggcg ccgacgcgtt cgccatagag cgagaaggat 
22861 ttggagaacg agctggaaac gaagaagctc aggcccgact gggcgaacag gcgcaccgcg 
22921 gcggcgtctt cctcgatgcc gttgccgaag ccctggtagg cgatgtcgag gaacggcacg 
22981 tggcccttgg ccttgagcac gtccagcacc tgttcccagt cgtccagctc gagatcgacg 

23041 ccggtcggat tatggcagca ggcgtgcaga accacgatcg agcgggccgg cagggcattc 
23101 aggtcttcca gcaggccggc gcggttcacg ccattgctgg cggcgtcgta atagcggtag 
23161 ttctgcaccg ggaagccggc ggcttcgaac agtgcgcggt ggttttccca gctcgggtcg 
23221 ctgatggcca cggtggcgtc gggcagcagg cgcttgagga agtcggcgcc gagcttgagc 
23281 gcgccggtgc cgccgacggc ctgggtcgtg accacacggc cggcggccag cagctcggac 
23341 tcgttaccga acagcagttt ctgtacgccc tggtcgtagg cggcgatccc ttcgatcggc 
23401 aggtagccgc gcggcgcgtg ggcctcgatg cgggccttct cggcagcctg cacggcacgc 
23461 aacagcggaa tgcgcccctc ctcgttgtag tacacgccca cgcccaggtt gatcttgccc 
23521 ggacgggtat cggcgttgaa ggcctcgttc aggccaagga tgggatcacg cggtgccatt 
23581 tcgacggcag aaaacagact cattttgcgg ccgctcggag tgtgaagaga ggagggcaac 
23641 gcaacccgtt atgcgggggc gcaaagggtt gcgcaaacgg ggggttatta tagacacccc 
23701 ttgatgcatg cggcgacatt taggtgcatg ctttcagcta tttctgacgc cggattttcc 
23761 ttggcgtcac agctccctgc gaggtttttc atggatacgt tccaactcga ctcgcgcttc 
23821 aagcccgccg gcgaccagcc ggaagccatc cggcaaatgg tcgaggggct ggaggcgggg 
23881 ctttcgcacc agaccctgct gggggtgacg ggctccggca agactttcag catcgccaac 
23941 gtgattgccc aggtgcagcg cccgaccctg gtcctggcgc cgaacaagac cctggcggcc 
24001 cagctctacg gggagttcaa gacgttcttc ccgcacaatt ccgtggagta cttcgtttcc 
24061 tactacgact actaccagcc ggaggcctac gtcccgtctt ccgataccta tatcgagaag 
24121 gactcctcga tcaacgacca tatcgagcag atgcgcctgt cggcgaccaa ggcgctgctc 
24181 gagcgtccgg atgcgatcat cgtcgccacc gtgtcgtcca tctacggcct cggtgatccc 
24241 gcgtcctacc tgaagatggt cctgcacctg gaccgcggcg accgcatcga ccagcgcgaa 
24301 ctgctgcggc gactgaccag cctgcagtac acccgcaacg acatggattt cgcccgtgcg 
24361 actttccgtg tgcgtggcga tgtgatcgac atcttcccgg ccgaatccga tcccaao 

// 
CDS <1..479 

/gene="wzz (rol)" 
/codon_start=3 
/product="Wzz (Rol)" 
/db_xref="PID:g1545846" 
/transl_table=11 

/translation="RDIEQRIQNLRRECQGRREDRIVQLKEALKVAGALKLEEPPLIS 

GQSSEELSAIMNGSLMYMRGSKAIMAEIQTLEARSSDDPFIPALRTLQEQQLLLSSLR 

VNSERVSVFRQDGPIETPDSPVRPRRAMILIFGLIIGGVLGGFLALCRIFLKKYAR" 
CDS 1286..2596 

/gene="wbpA" 
/codon_start=1 
/product="WbpA" 
/db_xref="PID:g1545847" 
/transl_table=11 

/translation="MIDVNTWEKFKSRQALIGIVGLGYVGLPLMLRYNAIGFDVLGI 

DIDDVKVDKLNAGOCYIEHIPQAKIAKARASGFEATTDFSRVSECDALILCVPTPLNK 

YREPDMSFVINTTDALKPYLRVGQWSLESTTYPGTTEEELLPRVQEGGLWGRDIYL 

VYSPEREDPGNPNFETRTIPKVIGGHTPQCLEVGIALYEQAIDRWPVSSTKAAEMTK 

LLENIHRAVNIGLVNEMKIVADRMGIDIFEWDAAATKPFGFTPYYPGPGLGGHCIPI 

DPFYLTWKAREYGLHTRFIELSGEVNQAMPEYVLGKLMDGLNEAGRALKGSRVLVLGI 

AYKKNVDDMRESPSVEIMELIEAKGGMVAYSDPHVPVFPKMREHHFELSSEPLTAENL 

ARFDAWLATDHDKFDYELIKAEAKLWDSRGKYRSPAAHIIKA" 
CDS 2670..3620 

/gene="wbpB" 
/codon_start=1 
/product="WbpB" 
/db_xref="PID:g1545848" 
/transl_table=11 

/translation="MKNFALIGAAGYIAPRHXRAIKDTGNCLVSAYDINDSVGIIDSI 

SPQSEFFTEFEFF^DHASNLKRDSATALDYVSICSPNYLHYPHIAAGLRLGCDVICEK 

PLVPTPEMLDQLAVIERETDKRLYNILQLRHHOAIIALKDKVAREKSPHKYEVDLTYI 

TSRGNWYLKSWKGDPRKSFGVATNIGVHFYDMLHFIFGKLORNWHFTSEYKTAGYLE 

YEQARVRWFLSVDANDLPESVKGKKPTYRSITVNGEEMEFSEGFTDLHTTSYEEILAG 
RGYGIDDARHCVETVNTIRSAVrVPASDNEGHPFVAALAR" 
CDS 3689..5578 

/gene="wbpC" 
/codon_start=1 
/product="WbpC" 
/db_xref="PID:g1545849" 
/transl_table=11 

/translation="MSSSSSKLLNGMVAVSSGRNIRLDVQGLRAVAVLAVLAYHANSA 
WLRAGFVGVDVFFVI SGF I ITALLVERGVKVDLVEFYAGRI KRIF P AYFVMLAI VC IV 
STILFLPDDYVFFEKSLQSSVFFSSNHYFANFGSYFAPRAEELPLLHTCSIANEMQFY 
LFYPVLFMCLPCRWRLPVFILLAILLFIWSGYCVFSGSQDAOYFALLARVPEFMSGAV 
VALSLRDRELPARLAILAGLLGAALLVCSFIIIDKOHFPGFWSLLPCLGAALLIAARR 
GPASLLLASRPMVWIGGISYSLYLWHWPILAFIRYYTGQYEIjSFVALLAFLTGSFLLA 
WF S YR Y I ETPAR KA VGLRQQALKWMLAA S WA I WTGGAQFNVL WA PA P I QLTRYA V 

PESICHGVOVGECKRGSVNAVPRVLVIGDSHAAQLKYFFDWGNESGVAYRVLTGSSC 

vpipafdlerlprwarkpcqaqidavaosmlnfdkiivagmwqyqkqspafaoamraf 

LVDTSYAGKQVALLGQI PMFESNVQRVRRFRELGLSAPLVSSSWQGANQLLRALAEG I 

pnvrfmdfsssaffadapyqdgeliyqdshhlnevgarrygyfasrqlorlfeqpqss 

VSLKP " 
CDS 5575..6066 

/gene="wbpD" 
/codon_start=1 
/product="WbpD" 
/db_xref="PID:g1545850" 
/transl_table=11 

/translation="MSVYQHPSAIVDIXjAQIGSDSRVWHFVHICAGARIGAGVSLGQN 
VFVGNKWIGDRCKIQNWSVTO^ 

LVKKG ATLGANCT I VCGVT I GEYAFLGAGAVINKNVP S YALMVGVPARQ I G WI ANS V S 
SCS" 
CDS 6152..6982 

/gene="wbpE" 
/codon_start=1 
/product="WbpE" 
/db_xref="PID:g1545851" 
/transl_table=11 

/translation="MIEFIDLKNOQARIKDKIDAGIQRVLRHGQYILGPEVTELEDRL 
ADFVGAKYCISCANGTDALQIVQMALGVGPGDEVITPGFTYVATAETVALLGAKPVYV 
DIDPRTYNLDPQLLEAAITPRTKAI I PVSLYGQCADFDAINAIASKYG I PVIEDAAQS 
FGAS YKGKRSCNLSTVACTSFFPSKPLGCYGDGGAI FTNDDEL.ATAIRQ I ARHGQDRR 
YHHIRVGVNSRLDTLOAAILLPKLEIFEEEIALRQKVAAEYDLSLKQVGIGTPFIGSG" 
CDS 7236..8552 

/gene="wzy (rfc)" 
/codon_start=1 
/product="Wzy (Rfc)" 
/db_xref="PID:g1545852" 
/transl_table=11 

/translation="KYIIARVDRSILLNTVTjLF^ 

KTVDFGL Y P YLMVLAL I C ALLCGG A I RRPGDLLVT LLW I L VPH S LVLNG ANQ YS PDA 
QPWAGVPLAIAFGILIIGIVNKIRFHPLGALQRENQGRRMLVLLSVLNIWLVFIFFK 
SAGYFSFDFAGOYARRALAREVFAAGSANGYLSSIGTQAFFPVLFAWGVYRRQWFYLV 
LGIWALVLWGAFGOKYPFWLFLIYGLMVYFRRFGQVRVSWWCALLMLLLLGALEH 

EVFGYSFI^DYFLRRAFIVPSTLLGAVDQFVSOFGSNYYRDTLLGALLGQGRTEPLSF 
R^GTEIFNNPDMNANWFFAIAYMQI^^ 

PVALLFTTKILEOPLLTVMLGSGVFLILLFLALISFPLKMSLGKTL" 
CDS 8549..9499 

/gene="wbpF" 
/codon_start=1 
/product="WbpF" 
/db_xref="PID:g1545853" 
/transl_table=11 

/translation="MSAAFINRVARVLVGTLGAQLITIGVTLLLVRLYSPAEMGAFSV 
WLSFATIFA\AATrGRYELAIFSTREEGELQAIVKLILQLTLLIFVAVAIAWIGRHLI 
ESMPWIGEYWFAIjAVAS lglg inklvlslltfqq SFNRLGVARVS laac I avaqvsa 

AYLLEGVSGLIYGOLFG\AA^ATAI^LWGKSLILNCIETPWRMVRQVAVQYINFPKF 
SLPADLVNTVASQVPVILLAAKFGGDSAGWFALTLKIMGAPISLLAASVLDVFKEQAA 
RDYREFGNCRGIFLKTFRLLAVLALPPFI IFGSLASGPLG - 
CDS 9831..10388 

/gene="hisH" 
/codon_start=1 
/product="HisH" 
/db_xref="PID:g1545854" 
/transl_table=11 

/translation="MLKRVGAKAKASDSREDIEOAEKLILPGVGAFDAGMOT^RKSGL 

VDVLTEQVMIKRKPVMGVCLGSQMLGIjRSEEGAEPGIjGWIDKDSVRFERRDDRKVPHM 

GWNQVSPQLEHPILSGINEOSRFyFVHSYYMVPKDPDDILLSCNyGOKFTAAVARDNV 
FGFQFHPEKSHKFGMOLFKNFVELV ■ 
CDS 10388..11143 

/gene="hisF" 
/codon_start=1 
/product="HisF" 
/db_xref="PID:g1545855" 
/transl_table=11 

/translation="MVRRRVIPCLLLKDRGLVKTVKFKEPKYVGDPINAIRIFNEKEV 
DELI LLDI DASRLNQEPNYELI AEVAGECFMP I CYGGG I KTLEHAEKI FSLGVEKVS I 
NTAALMDLSLIRRI ADKFGSQS WGS I DCRKGFWGGHSVFSENGTRDMKRS PLEWAQA 
LEEAGVGE I FLNS I DRDGVQKGFDNALVENI ASNVHVPVI ACGGAGS I ADL I DLFERT 
CVSAVAAGSLFVFHGKHRAVLI SYPDVKKLDVG " 
CDS 11281..12411 

/gene="wbpG" 
/codon_start=1 
/product="WbpG" 
/db_xref="PID:g1545856" 
/transl_table=11 

/translation="MKICSRCVMDTSDAEIVFDEAGVCNHCHKFDNVQSROLFSDASG 
BQRLQKI IGQIKKDGSGKDYDCI IGLSGGVDSSYLAVKVKDLGLRPLXAmVDAGWNSE 
LAVSNIEKIVKYCGFDLHTHVINWEEIRDLQLAYMKAAVANODVPQDHAFFASMYHFA 
VKNNIKYILSGGNLATEAVFPDTWHGSAMDAIKLKAIHKKYGERPLRDYKTISFLEYY 
FWYPFVKGMRTVRPLNFMAYDKAKAETFLQETIGYRSYARKHGESIFTKLFONYYLPT 
KFGYDKRKLHYSSMILSGQMTRDEAQAKLAEPLYDADELQFDIEYFCKKMRITQAOFE 
ELMNAPVHDYSEFANWDSRORIAKKVQMIVQRALGRRINVYS" 
CDS 12427..13548 

/gene="wbpH" 
/codon_start=1 
/product="WbpH" 
/db_xref="PID:g1545857" 
/transl_table=11 

/translation="MTKVAHLTSVHSRYDIRIFRKQCRTLSOYGYDVYLWADGKGDE 
VKDGVR I VDVGVLSGRLNRI LKTTRKI YEQALALGADVYH FHDPEL I PVGLRLKKQGK 
QVIFDSHEDVPKQLLSKPYMRPFLRRWAVLFSCYEKYACFKLDAVLTATPHIREKFK 
NINGNVLDINNFPMLGELDAMVPWASKKTEVCYVGGrTSIRGVREWKSLECLKSSAR 
LNLVGKFSEPEIEKEVRALKGWNSWEHGQLDREDVRRVLGDSVAGLWFLPMPNHVD 

AOPNKMFEYMSSGIPVIASNFPLWREIVEGSNCGICVDPLSPAAIAEAIDYLVSNPCE 
AAALGRNGQRAVNERYNWDLEGRKLARFYSOLLSKRDSI" 
CDS 13545..14633 

/gene="wbpI" 
/codon_start=1 
/product="WbpI" 
/db_xref="PID:g1545858" 
/transl_table=11 

/translation="MKILTIIGARPQFIKASWSKAIIEQQTLSEIIVHTGQHFDANK 
SEIFFEQLGIPKPDYQLDIHGGTHGQMTGRMLMEIEDVILKEKPHRVLVYGDTNSTLA 
GALAASKLHVPIAKIEAGLRSFNMRMPEEINRILTDQVSDrLFCPTRVAIDNLKNEGF 
ERKAAKIVNVGDVKQDSALFFAQRATSPIGLASODGFILATLHRAENTDDPVRLTSIV 
EALNEIQINVAPWLPLHPRTRGVIERLGLKLEVQVIDPVGYLEM1WLLORSGLVLTD 

SGGVQKEAFFFGKPCVTMRDQTEWVELVTCGANVLVGAARDMIVESARTSLGKTIQDD 
GQLYGGGQASLGLLNILPSCDALRVEFK" 



FIGURE 16 

CDS 14651..15892 

/gene="wbpJ" 
/codon_start=1 
/product="WbpJ" 
/db_xref="PID:g1545859" 
/transl_table=11 

/translation="MN\AmmPYAGGPGVGRYWRPYYFSKFWNQAGHRSVIISAGYHH 

LLEPDEKRSGVTCVNGAEYA WPTLRYLGNGVGRMLSML I FTMMLLPFCL I LALKRGT 

PDA 1 1 YS S PH PFGWSCWLAARLLGAKFVFEVRDI WP LS LVELGGLKADN PLVRVTGW 

I ERFS Y ARADKI I S LLPCAEPHMADKGLPAGKFLWVPNGVDS SDI S PDSAVSS SDLVR 

HVQVLKEQGVFWIYAGAHGEPNALEGLVRSAGLLRERGAS IRI ILVGKGECKEQLKA 

IAAQDASGLVEFFDQQPKETIMAVLKLASAGYISLKSEPIFRFGVSPNKLWDYMLVGL 

PVIFACKAGNDPVSDYDCGVSADPDAPEDITAAIFRLLLLSEDERRTMGQRGRDAVLE 
HYTYESLALQVLKALADGRAA " 
CDS 15889..16851 

/gene="wbpK" 
/codon_start=1 
/product="WbpK" 
/db_xref="PID:g1545860" 
/transl_table=11 

/translation="MKAVMVTGASGFVGSALCCELARTGYAVIAVVRRVVERIPSVTY 
IEADLTDPATFAGEFPTVDCIIHIAGRAHILTDKVADPLAAFREWRDATVRLATRAL 
EAGVKRFVFVSSIGVNGNSTRQOAFNEDSPAGPHAPYAISKYEAEQELGTLLRGKGME 
L.VWRPPLIYANDAPGNFGRLLKLVASGLPLPLDGVRKARSLVSRRNIVGFLSLCAEH 

PDAAGELFLVADGEDVSIAOMIEALSRGMGRRPALFTFPAVLLKLVMCLLGKASMHEQ 
LCGSLQVDASKARRLLC3WVPVETIGAGL0AAGREYILRORERRK" 
CDS 19678..21675 

/gene="wbpM" 
/codon_start=1 
/product="WbpM" 
/db_xref="PID:g1545862" 
/transl_table=11 

/translation="MLDNLRIKLLGLPRRYKRMLQVAADVTLVWLSLWLAFLVRLGTE 

DMISPFSGHAWLFIAAPLVAIPLFIRFGMYRAVMRYLGNDALIAIAKAVTISALVLSL 

LVYWYRSPPAVVPRSLVF^^XV^WLSMLLIGGLRLAMRQYFMGDWySAVQSVPFLNRODG 

LPRVAIYGAGAAANQLVAALRLGRAMRPVAFIDDDKQIANRVIAGLRVYTAKHIROMI 

DETGAQEVLLAI PSATRARRREI LESLEPFPLHVRSMPGFMDLTSGRVKVDDLQEVDI 

ADLLGRDSVAPRKELLERCIRGQWMVTGAGGSIGSELCRQIMSCSPSVLILFEHSEY 

NL YS I HQELERRI KRES LSVNLLP I LGS VRNPERLVDVMRTWKVNTVYHAAAYKHVPI 

VEHNXAEGVLNi^JVIGTLHAVOAAVOVGVQNFVLISTDKAVRPTNVMGSTKRLAEMVLQ 

ALSNESAPLLFGDRKDVHHVNKTRFTMVRFGNVLGSSGSVIPL.FREQIKRGGPVTVTH 

PSITRYFMTIPEAAQLVIQAGSMGQGGDVFVLDMGPPVKILELAEKMIHLSGLSVRSE 

RSPHGDIAIEFSGLRPGEKLYEELLIGDNVNPTDHPMIMRANEEHLSWEAFKWLEQL 

LAAVEKDDYSRVRQLLRETVSG YAPDGEI VDWI YRQRRRE P " 
CDS 22302..23693 

/gene="wbpN" 
/codon_start=1 
/product="WbpN" 
/db_xref="PID:g1545863" 
/transl_table=11 

/translation="MINSHLLYRLSYRGTARRMLLIKKGKPLPMTSPFSLODLDDGLG 

DGLQVRFVORGDADTAGADGVDTELGLOALDLVGGQAGIGEHATLATDETEVALGAVG 

CQLLDKRQAHVADAVAHLAQFLLPEGPQFRAVEHGGDDAGAVGRWVRIVGADHPI.HU3 

OHAGRFIAAFGHDREGADAFAIEREGFGERAGNEEAQARLGEQAHRGGVFLDAVAEAL 

VGDVEERHVALGLEHVOHLFPWQLEIDAGRIMAAGVQNHDRAGROGIQVFQQAGAVH 

AIAGGWIAWLHREAGGFEQCAWFPARVADGHGGVGQQALEEVGAELERAGAADGL 

GRDKTAGGQQLGLVTEQQFLYALWGGDPFDRQVAARRVGLDAGLLGSLKGTQQRNAP 

LLVWHAHAQVDLARTGIGVEGFVQAKDGITRCHFDGRKOTHFAAARSVKRGG0RNPL 
CGGAKGCANGGLL " 
CDS 23704..>24417 

/gene="uvrB" 
/codon_start=1 
/product="UvrB" 
/db_xref="PID:g1545864" 
/transl_table=11 

/translation="MHAATFRCMLSAISDAGFSLASQLPARFFMDTFOLDSRFKPAGD 

OPEAIRQMVEGLEAGLSHQTLLGVTGSGKTFSIANVIAQVQRPTLVLAPNKTLAAQLY 

GEFKTFFPHNSVEYFVSYYDYYOPEAyVPSSDTyiEKDSSINDHIEOMRLSATKALLE 

RPDAIIVATVSSIYGLGDPASYLKMVLHLDRGDRIDQRELLRRLTSLOYTRNDMDFAR 
ATFRVRGDVIDIFPAESDLE" 
CDS 16911..17822 

/gene="wbpL" 
/codon_start=1 
/product="WbpL" 
/db_xref="PID:g1545861" 
/transl_table=11 

/translation="MMIWMIACLVVLLFSFVATWGLRRYAIATKLMDVPNARSSHSQP 

TPRGGGVAIVLVFLAALVWMLSAGSISGGWGGAMLGAGSGVALLGFLDDHGHIAARWR 
LLGHFSAAIWILLWTGGFPPLDVVGHAVDLGWL^ 

ASVEIAIGVCVGGALIYWLTGHVAMVGIPLLLACAVAGFLIWNFPPARIFMGDAGSGFL 

GMVIGALAIOAAWTAPSLFWCWLILLGVFIVDATYTLIRRIARGEKFYEAHRSHAYQF 
ASRRYASHLRVTLGVLAINTLWLLRWH - 

source 17935..19144 

/organism="Pseudomonas aeruginosa" 
/insertion_seq="IS1209(?A)" 
/strain="PAO1" 
/serotype="O5" 
misc_feature 18032..19141 
/note="IS407" 
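
The feature table above gives each CDS as a 1-based coordinate range with a /codon_start offset and /transl_table=11. As a minimal sketch of how those fields map onto the nucleotide listing, the Python below slices a range out of a sequence and translates it. The function names (extract_cds, translate) and the first_pos parameter are illustrative assumptions, not part of the source. Table 11 assigns the same amino acids to codons as the standard code and differs only in permitted start codons, so an alternative start such as GTG is shown here as V rather than M.

```python
from itertools import product

# Codon -> amino acid map in NCBI order (bases T, C, A, G).
# Translation table 11 uses these same assignments.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def extract_cds(seq, start, end, first_pos=1):
    """Return bases start..end (1-based, inclusive) from a listing.

    first_pos is the coordinate of the first base held in seq, so a
    fragment copied from the middle of the listing can be used directly.
    Spaces and line breaks inside the listing are ignored.
    """
    seq = "".join(seq.split())
    return seq[start - first_pos:end - first_pos + 1]


def translate(cds, codon_start=1):
    """Translate a coding sequence, honouring /codon_start (1, 2 or 3).

    Translation stops at the first stop codon; unknown codons become X.
    """
    cds = "".join(cds.split()).upper()[codon_start - 1:]
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Bases 15901..15930 of the listing, which lie inside the wbpK CDS.
fragment = "atggtgaccg gtgcatcagg attcgtcgga"
peptide = translate(extract_cds(fragment, 15901, 15930, first_pos=15901))
print(peptide)
```

The printed peptide corresponds to residues 5 to 14 of the WbpK translation given in the feature table, because base 15901 is twelve bases downstream of the CDS start at 15889.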
Serotype O2. 



^)-B-D-Man(2NAc3mKl^^ 
CH 3 C=NH 

Serotype O5. 

^M^D-Man(2NAc3m^l^^ 
CH 3 C=NH 

Serotype O16. 

^H^D4*an(2NAc3N)A^ 
CH 3 C=NH 



Serotype O20. 

^Hx-L-Gul(2NAc^ 

CH 3 C=NH -70% and J Ac 

^)-B-D-Man(2N^ 

CH 3 C=NH -30% QAc 
E. coli a  c.a...t TTGACA..t  17bp  ggTATAATg 
PsbA       c.t...t TTGtgA..a  18bp  cgcAgAAag 
bis**      t.a...t TTGcCc..c  16bp  gcTtTgtTg 
PsbG       c.a...c TTGgCA..g  16bp  tcaAgAtTg 
IS407-1    c.t...g TTGgCA..c  17bp  agTtTgcTg 
IS407-2    g.t...t TTGgCg..c  17bp  acTAagcag 
IS407-2    t.a...g TTGAtg..a  17bp  acTAcctag 
PsbN       t.g...c TTGctg..a  17bp  cggATcgTc 
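
The table above compares candidate promoter regions with the E. coli consensus hexamers TTGACA and TATAAT separated by 16 to 18 bp. A minimal sketch of such a comparison is given below. It is not the method used to prepare the figure; the function names, the min_matches threshold and the test sequence are assumptions made for illustration only.

```python
def hexamer_matches(hexamer, consensus):
    """Count positions at which a hexamer agrees with a consensus hexamer."""
    return sum(a == b for a, b in zip(hexamer.upper(), consensus))


def find_promoters(seq, minus35="TTGACA", minus10="TATAAT",
                   spacers=(16, 17, 18), min_matches=4):
    """Scan seq for -35/-10 hexamer pairs separated by an allowed spacer.

    Returns tuples (index of -35 hexamer, spacer length, -35 hexamer,
    -10 hexamer, total matching positions out of 12). The threshold
    min_matches applies to each hexamer separately and is an arbitrary
    illustrative choice.
    """
    seq = "".join(seq.split()).upper()
    hits = []
    for i in range(len(seq) - 5):
        m35 = hexamer_matches(seq[i:i + 6], minus35)
        if m35 < min_matches:
            continue
        for spacer in spacers:
            j = i + 6 + spacer
            if j + 6 > len(seq):
                continue
            m10 = hexamer_matches(seq[j:j + 6], minus10)
            if m10 >= min_matches:
                hits.append((i, spacer, seq[i:i + 6], seq[j:j + 6], m35 + m10))
    return hits


# Synthetic test sequence: perfect consensus hexamers with a 17 bp spacer.
demo = "CC" + "TTGACA" + "G" * 17 + "TATAAT" + "CC"
hits = find_promoters(demo)
print(hits)
```

Upper-case letters in the table mark positions that agree with the consensus, which is the quantity hexamer_matches counts.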
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GTG 
ATG 

AAA AAA TTAA 

Jiiii 111111 mi Spaces between RBS 

0987654321098765432101234567890123 and first codon 

psbA aaa tt.QA.<3G.TGAgttggAAAATgatagatgTTAA 8 

psbB tcatttccat^CSSAcgaaccATgAAAaatttcgc 6 

psbC <=tttggcAA<3ctgcagcgtaATgt.tgtgcacTTc io 

psbD tcgagtgt.QA.GtctcaagccAT2agttattaTcA 9 

psbE a gcAAGG.fc&GacgtgtgaccAT2attgaatTcAt io 

ct sr c g t tgac.(3AattgacggATgtatatatactt 8 
psbF at 5tctttAC3jSAaaaactctA^agtgcggcTtt 8 

htsH tgtgccaag.«3GAGaTGccaaS^atcgttgTTAt 7 

huF aa cttcgt.QGA£cttgtctgATggtccggaggcg 8 

psbC tgcCtcg.QGAGSTtgttgTgATgAAAgatcTgtt 
psbH c STtgatgaccggggccgctcATgactAAAgTTgc 
psbl ctgagtaagc.QA.Qattccat ATCAA AatitgTrrAr. 
psbJ taa^CSSAtttatttagttccATgaacgtctggtA 
psbK ct tgctgatg.qGcgcgcagcATCAAAgctgTcAt 
gaacggggc t.qA.TaaataggATgt tgga taaTt t 
ggactcgaaccAGGgacccaATgattaacagTcA 
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7 
K-tuple value  : 1 
Gap penalty    : 5 
Window size    : 10 
Filtering level: 2.5 
Open gap cost  : 10 
Unit gap cost  : 10 

Setting of other parameters 

The alignment was done on 3 Protein sequences. 

Character to show that a position in the alignment is perfectly conserved: '*' 
Character to show that a position is well conserved: '.' 

Alignment 



[Alignment of PSBA, EC_RFFD and BS_EPSD; the aligned residue rows are not recoverable from the source.] 



Consensus length: 451 
Identity  : 111 ( 24.6%) 
Similarity: 154 ( 34.1%) 

Dictionary of the sequences used for the alignment 

[ 1] PSBA 

Size: 436 residues. 

[ 2] EC_RFFD 

Size: 420 residues. 
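
The summary lines above report counts and percentages relative to the consensus length, for example 111 of 451 positions (24.6%). A minimal sketch of that arithmetic follows. It assumes that "Identity" counts the columns in which every sequence carries the same residue (the positions marked '*'); the alignment program's exact definition is not given in the source, and the function names and toy alignment are illustrative.

```python
def percent(count, consensus_length):
    """Percentage of alignment columns, rounded to one decimal place."""
    return round(100.0 * count / consensus_length, 1)


def alignment_stats(rows, gap="-"):
    """Return (consensus length, identical columns, percent identity).

    rows are aligned sequences of equal length; a column counts as
    identical when all rows share the same non-gap residue.
    """
    if not rows or len({len(r) for r in rows}) != 1:
        raise ValueError("aligned rows must be non-empty and of equal length")
    length = len(rows[0])
    identical = 0
    for column in zip(*rows):
        first = column[0]
        if first != gap and all(res == first for res in column):
            identical += 1
    return length, identical, percent(identical, length)


# Toy three-sequence alignment (not taken from the figure).
stats = alignment_stats(["MKV-L", "MRVAL", "MKVSL"])
print(stats)

# Values reported for the PSBA / EC_RFFD / BS_EPSD alignment.
print(percent(111, 451), percent(154, 451))
```

The same arithmetic reproduces the percentages printed with the other alignments in these figures, which gives a simple check on digits damaged in reproduction.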
FIGURE 36 

Setting of computation parameters 

K-tuple value  : 1 
Gap penalty    : 5 
Window size    : 10 
Filtering level: 2.5 
Open gap cost  : 10 
Unit gap cost  : 10 

Setting of other parameters 

Alignment 



PSBE 

BP_BPLC 

BS DEGT 

S_ERYC1 

S_DNRJ 

BS_SPSC 



PSBE 

BP_BPLC 

BS_DEGT 

S_ERYC1 

S_DNRJ 

BS SPSC 



PSBE 

BP_BPLC 

BSJDEGT 

S_ERYC1 

S_DNRJ 

BS SPSC 



PSBE 

BP_BPLC 

BS_DEGT 

S_ERYC1 

S_DNRJ 

BS^SPSC 



2 - SSSH^K^^ 1 ^^ 





GEGGMLTT 



48 
47 
49 
49 
49 
50 



98 
97 
99 
98 
99 
100 



148 
147 
149 
148 
149 
150 



198 
197 
199 
198 
199 
199 
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PSBE 

BP_BPLC 

BS DEGT 

S_ERYCl 

S_DNRJ 

BS_SPSC 



PSBE 

BP BPLC 

BS_DEGT 

S_ERYCl 

S_DNRJ 

BS_SPSC 



PSBE 

BP_BPLC 

BS^DEGT 

S_ERYCl 

S_DNRJ 

BS_SPSC 

PSBE 

BP_BPLC 

BS_DEGT 

S_ERYC1 

S_DNRJ 

BS_SPSC 



FIGURE 36 (Cont'd) 

SSK^XSg: : : : :::::: SE^Y-^j^qa* 2 3 * 

NDDE1AEKCRVIRVHG -iS25S^ir"^S^ ,DTLQCA 235 

PDAEVDRRLRRLRYYG m5?^S(^S TNAR1 * DEL0AA 236 

^J^IFEEEIALRQKVAAEY DLS 



VWVRHPE v -GVQTXIHYPTPVHLSPAYA- DLGL 

^^^^^ISYPWPVHTMSGPA-HLGY 

IPVHIHPYYQKQFGY 



LKQV-GIGTPFI • 



LYVLQVDEKKAGVTRSEMITALTOEYNIGTSVHF- 



GSG- 



276 
366 
372 
365 
370 
389 



273 
325 
328 
325 
327 
345 



Consensus length: 394 
Similarity: 83 ( 21.1%) 

[ 1] PSBE 

Size: 276 residues. 

[ 2] BP_BPLC 

Size: 366 residues. 

[ 3] BS_DEGT 

Size: 372 residues. 

[ 4] S_ERYC1 

Size: 365 residues. 

[ 5] S_DNRJ 

Size: 370 residues. 

[ 6] BS_SPSC 

Size: 389 residues. 
FIGURE 37 



Proerrarr. C^iftp | 

Hydropathy index computation for sequence PSBF. 

Total number of amino acids is: 316. 



[Plot: hydropathic index (vertical axis) against amino acid position (horizontal axis).] 

Hydropathic index of PsbF from amino acid 1 to amino acid 316. 
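
A hydropathy profile of this kind is commonly obtained by averaging per-residue hydropathy values over a sliding window. The sketch below uses the Kyte-Doolittle scale and a window of 9 residues. Both are assumptions, since the figure does not state which scale, window or axis scaling the program used, and the function name is illustrative.

```python
# Kyte-Doolittle hydropathy values (assumed scale; not stated in the figure).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein, window=9):
    """Return the mean hydropathy of each window along a protein sequence.

    The result has len(protein) - window + 1 values; value k describes
    residues k+1 .. k+window (1-based). Residues outside the 20 standard
    amino acids raise KeyError so that damaged input is noticed.
    """
    protein = "".join(protein.split()).upper()
    if window < 1 or window > len(protein):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[aa] for aa in protein]
    return [sum(scores[i:i + window]) / window
            for i in range(len(protein) - window + 1)]


# Small worked example: three isoleucines followed by three arginines.
profile = hydropathy_profile("IIIRRR", window=3)
print([round(v, 2) for v in profile])
```

Positive values indicate hydrophobic stretches, such as candidate membrane-spanning segments, and negative values indicate hydrophilic stretches.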
FIGURE 38 

Setting of computation parameters 

K-tuple value  : 1 
Gap penalty    : 5 
Window size    : 10 
Filtering level: 2.5 
Open gap cost  : 10 
Unit gap cost  : 10 

Setting of other parameters 

Alignment 

EC KFRC MK- - -VLTVFTOpS^^^t t^ LTO ^^ TG Q^EANMSDV 49 



48 



* . ** . * 

BP"*BPLD SSSoSfuS^J " " HGGTHGQMTGRMLME I EDVT LKEKPHRVL VYr 

EC>PR C ^f^ 0 ?!™?^ 1 - -HC^HGDMTGRMLVAl^^^v^^^5 

BS_0RFX 

SB_RFBC VLDFFEI 

PA_PSBI 
BP_BPLD 
EC_NFRC 
BS_ORFX 
SB RFBC 



VLKLFS I - VPDyD^IMOP^^^2^ VALEQVMQAEKPD VVLVyG 97 



PA_PSBI 
BP^BPLD 
EC_NFRC 
BS_ORFX 
SB_RFBC 



PA_PSBI 



mm * ## ^ 1WWWN SPFPEEGNRQLTSK1AFF 145 
HFS PTETSRQn£2re - - NwSsR^^^ £„ 

S ° D "-" G "- FI ^ T1J1R ^DDPV R1( TSIVi y ^ IQINVA . ... 
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GY- -SHPVOJXJVGEDKMIl^TAHRRENLGEP- -MENMFKA^rV^Sn III 

EIISI^K I ^..DICKIILVTLl^QGEI,--S^^iSS 239 



- PWLPLH - - PRTRGVI ERLGLKLE VQVIDPVGYLEMTWT T nDorr 

II 

SB.RFBC lEIVFPV^reiREVVNE-.-KI^^IKLVEPlAYPG?^ 



MS !=Sffi=SS ill 

SCKKLLSDERLYEKMSQAGNPFGDGKASKKILD-tTr^rp^^DI--- 37? 



PA_PSBI FK 362 

BP_BPLD -H 362 

EC_NFRC 376 

BS_ORFX GK 380 

SBLRFBC -K 378 



Consensus length: 402 
Identity  : 71 ( 17.7%) 
Similarity: 109 ( 27.1%) 

Dictionary of the sequences used for the alignment 

[ 1] PA_PSBI 

Size: 362 residues. 

[ 2] BP_BPLD 

Size: 362 residues. 

[ 3] EC_NFRC 

Size: 376 residues. 

[ 4] BS_ORFX 

Size: 380 residues. 

[ 5] SB_RFBC 
FIGURE 39 

Setting of computation parameters 

K-tuple value  : 1 
Gap penalty    : 5 
Window size    : 10 
Filtering level: 2.5 
Open gap cost  : 10 
Unit gap cost  : 10 

Setting of other parameters 

The alignment was done on 3 Protein sequences. 

Character to show that a position in the alignment is perfectly conserved: '*' 
Character to show that a position is well conserved: '.' 

Alignment 



BP-SIX J^^P^^PGVGRYWI^yyFSKFWNQAGHRSVllSAGVHH^ 50 

YE-?Rsi 2?"; FRPYYFGREWIGHGHOVKVAASTISHIRARAP 34 

— 11 X - - P&mnRIUTraf 



82 

FEVKKI 4 9 



Sis SsfSiE ?ss ^-— sags i?i 

J.RKFKPDIV- -HSHMFHA- - ML FAR I LRVFTK I PAL I CTAHNT 88 



YE_TRSE NEGS S I^MLAYKYTDKLAS LSTNV S QD AV - - -DSFIHKG^TGRMIaJI "5 



PA_PSBJ 
BP_BPLE 
YE*~TRSE 



PA_PSBJ 
BP_BPLE 
YE TRSE 



^^^S^ISSS^SSSSS^SSSSSi lit 

NGIDASQF- - - - DFSMDERKVKRSELG I FNDTP 1 1 LSV - -GRLTEATOYP 1*1 

' ' : ... . * 

2™^^^°^^" ■ - IILV GKGECKEQLKAIAAQDAS-GLVEFFD 290 

NLLTAFS LL I KDNSLQS FPQLF I VGTGHLDG YLKNMSKEre I DKYVTLFG 229 
FIGURE 39 (Cont'd) 



PA_PSBJ 
BP BPLE 
YE~TRSE 



QQ PKET I MAVUCLAS AG Y I S LKS E P I FRFGVS PNKLWD YMLVGLP V I FAC 34 0 

PVPRPAVQAVMADIDAAYIGUWSPLFQFGVSPNIOJDYMLSACPVVQS I 325 
Q- -RDDILQLMCAADI - FVLSSEWEGFPLVITEA MACKKIIVAT 270 



PA_PSBJ 
BP_BPLE 
YE TRSE 



PA_PSBJ 
BP_BPLE 
YE TRSE 




LEHYTYESIiALQVIiNAlADG RAA- * 413 

LARHDYPVLAQQFIiDAVOSVTPRRAASR 403 
I QTNS I EKI IE- - LGCLFI LNLKNNC - - 344 



Consensus length: 428 
Identity  : 30 ( 7%) 
Similarity: 132 ( 30.8%) 

Dictionary of the sequences used for the alignment 

[ 1] PA_PSBJ 

Size: 413 residues. 

[ 2] BP_BPLE 

Size: 403 residues. 

[ 3] YE_TRSE 

Size: 344 residues. 
FIGURE 40 

Setting of computation parameters 

K-tuple value  : 1 
Gap penalty    : 5 
Window size    : 10 
Filtering level: 2.5 
Open gap cost  : 10 
Unit gap cost  : 10 

Setting of other parameters 

The alignment was done on 3 Protein sequences. 

Character to show that a position in the alignment is perfectly conserved: '*' 
Character to show that a position is well conserved: '.' 

Alignment 
PA PSBL 



PA_PSBL 




46 



PA PSBL 

2S5T XSSSg^^^SS^SSS iSS 



92 



140 



PA PSBL 

SS5T S»S^^™s^53b£ Hi 

186 

230 
233 
236 




JgSSSK:::::jag5SSSS!iS «™««°»a 



HI RFE 



263 
281 
FIGURE 40 (Cont'd) 



PA_PSBL 
YEJTRSF 
HI RFE 



YASHLRVTLGV1AINTLWLL- -r 




r iUUU^FVTLSAI AIKI IWLF - - P IALLAGLNIVNPI IALI IS Y I PLLYI - 
AGLTSRQAFLL I TFVS AVCAT I G I L»GE VYYVNEW - AMFVGFP I LFFLY 



301 
330 
328 



PA_PSBL 
YEJTRSF 
HI RFE 




— WH 303 
VNND 341 
LLKKA 355 



Consensus length: 377 
Identity  : 55 ( 14.6%) 
Similarity: 98 ( 26%) 

Dictionary of the sequences used for the alignment 

[ 1] PA_PSBL 

Size: 303 residues. 

[ 2] YE_TRSF 

Size: 341 residues. 

[ 3] HI_RFE 

Size: 355 residues. 
FIGURE 41 

Setting of computation parameters 

K-tuple value  : 1 
Gap penalty    : 5 
Window size    : 10 
Filtering level: 2.5 
Open gap cost  : 10 
Unit gap cost  : 10 

Setting of other parameters 

The alignment was done on 4 Protein sequences. 

Character to show that a position in the alignment is perfectly conserved: '*' 
Character to show that a position is well conserved: '.' 

Alignment 

SISAKLRFLILIIIDSFIVTFSVFLGYAI - --LEPYFK 37 



... * * 

■ • • ... 

PSBM 
TRSG 



2S2^^ 14 5 

• • - * ** * 



SA~CAPn FPADHHMA DPRTPVLX YGAGGAGSQIAMALRTGPHYR - - PVAMTn , Z, 



174 



•MA 273 
* * 
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TRSG 



PSBM 
TRSG 
FIGURE 41 (Cont'd) 



PSBM LLERCI RGQWMVTGAGGS IGSELCRQ IMSCSPSVLI LFEHSEYNLYS I H 34 0 

TRSG LLAKNITGKWMVTGAGGS IGSELCRQ 1 I VEKPSLLILFDISEFSLYS IE 320 

B P_BPLL LLGRCVTDRWMVTGAGGS IGSELCRQ I LALRPRKLVLFE IAEPALYAIE 327 

SA_CAPD LISRELTNKTILVTGAGGSIGSEICRQVSKFDPQKIILLGHGENSIYSIH 323 

* ************** * * * * * 



QELERRI KRESLSVNLLP ILGSVROTERLVDVMRTWKVNTVYHAAAYKHV 390 

NEMAAICKKNKIETEFVAI^GSVQSEKRLVQIMSNFHVNTVYHAAAYKHV 370 

BP_BPLL QDLRQR I GERN I E I A - - GVLGSVRDAAHCLAQLQEHGVQTI YHAAAYKHV 375 

SA_CAPD QELSKTYGNR - - -IEFVPVIADVQNKTRILEVMNEFKPYAVYHAAAHKHV 370 

• * • • • .,******** 

PSBM P I VEHN I AEGVLNNVI GTLHAVQAAVQVGVQNEVL I STDKAVRPTNVMGS 4*4 0 

Z~ SG . PLVENNVIEGVRNNIFGTLYCAKAAIKSGVEKFVLISTDKAVRPTNTMGA 420 

??-5^ L P I VEHNVS EG I RTNAFGTLNMAETAI Q AG VLDFVL I STDKAVRPTNVMGA 425 

SA_CAPD PLME YNPHEAI RNN I LGTKNVAES AKEGE VSKFVMI STDKAVNPSNVMGA 420 

*.-* * *.. .* .** * * **.******* * * ** 



PSBM TKRLAEMVLQALSNESAPLLFGDRKDVHHVNKTRF1WVRFGITVLGSSGSV 4 90 

TRSG TKRMAELVLQALSTEQ NKTKFCMVRFGNVLGSSGSV 456 

^^? LILQA HAQIQDKTRFSMVRFGNVLGSSGSV 461 

SA_CAPD TKR I AEMV I QS LNEDNS - - - - KTSFVAVRFGNVLGSRGSV 456 



PSBM I 



'*•**...*. ** . * *********. **i 

PLFREQ I KRGGPVTVTHPS I TRYFMT I PEAAQLVI QAGSMGQGGDVFVL 
TLFKKOI AEGGPTTT.TWK'nT TPVI?MTT r>TTR a rvr uT^^ri^u^^^^^ 



~ ~~~ " * * «*r sx£.w a *wwr v i v x ttr& x I KxrMT i. FEAAQL/vI QAGSMGQGGDVFVL 540 

J? ^ VPLFKKQ I AEGGP I TLTHKD 1 1 RYFMT I PEAAQLVI QAGAMGQGGDVFVL til 

SA_CAPD I PLFKNQIESGGPVTVTHPEMTRYFMTI PEASRLVLQAGALAQGGEVFVL III 

• ** ★**.*.** ...****■#****..**.*** * **** 



55? ^*^J*BIAEKMIH^^ 590 

I^nnr x DMGDPVKI IDLAKRMINLSGLS I KSEENLDGDIAIEI SGLRPGEKLYEEL 556 

SHSt DMGEPVL I RELAERMVRLYGLTVKNSDQ PDGD I E I R I TGLRPGEKLYEEL til 

SA.CAPD DMGKPVKIVDLAKNLIRLSG KKEEDIGIEFSGIRPGEKLYEEL HI 

*** ** * •**.--..* * ..**.* ..*.********** 

LIGDNVNPTDHPMIMRANEEHLSWEAFKVVLEQLIJUVVEKDDYSRVRQL^ 64 0 
BP opt t t* t SSSYe *^^^ E * **^EWDDLN I LLNKI ETACNDFNYEC I RSLL 606 

SA_CAPD LNKNEIHPQ- - QVYEKIYRGKVDHYIKTEVDLIV 581 

PSBM RETVSGYAPDGEI VDW I YRQRRRE P 665 

T*? G LEAPTGFQPTDGI CDWWQKTHSENAKNVIVH 638 

BP_BPLL GQIVREYAS VTYA 624 

SA_CAPD EDLINNFS - KEKLLKIANR 599 

Consensus length: 682 
Identity  : 154 ( 22.6%) 
Similarity: 185 ( 27.1%) 

Dictionary of the sequences used for the alignment 

[ 1] PSBM 

Size: 665 residues. 

[ 2] TRSG 

Size: 636 residues. 

[ 3] BP_BPLL 

Size: 624 residues. 

[ 4] SA_CAPD 

Size: 599 residues. 
FIGURE 47 

Entire sequence of rol gene: 

Tmnx^cnAcrceiG^^^ 

G AAGA GCTTC GTTCG GTTTCG CGTGATCCGCTTn?^^ 

^GG<KiAAAGACTTIXXOG<Trc^A-irS^^ ^ AA ^ GCTn ^^ A ^GATAAGTATrAnT^ 
<^GTAAGGCGATrATGGCOOAA^^ 
FIGURE 44 
FIGURE 45 
FIGURE 47 
FIGURE 48 